BBN  Laboratories  Incorporated 

A Subsidiary of Bolt Beranek and Newman Inc.

AD-A187  355 


Report  No.  6636 


Research  in  Knowledge  Representation 
For  Natural  Language  Communication  and 
Planning  Assistance 

Annual  Report 

18  March  1986  to  31  March  1987 

B.  Goodman,  A.  Haas,  E.  Hinrichs,  H.  Kautz, 

L.  Polanyi,  J.  Schmolze,  M.  Vilain 


Prepared  for: 

Defense  Advanced  Research  Projects  Agency 


DTIC ELECTE
NOV 04 1987


Distribution Statement A:
Approved for public release; Distribution Unlimited


Unclassified 


SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)


REPORT DOCUMENTATION PAGE
READ INSTRUCTIONS BEFORE COMPLETING FORM

1. REPORT NUMBER: 6636
2. GOVT ACCESSION NO.:
3. RECIPIENT'S CATALOG NUMBER:

4. TITLE (and Subtitle):
Research in Knowledge Representation for Natural Language Communication
and Planning Assistance

5. TYPE OF REPORT & PERIOD COVERED: Annual Report, 3/86 - 3/87

6. PERFORMING ORG. REPORT NUMBER: 6636

7. AUTHOR(s):
B. Goodman, A. Haas, E. Hinrichs, L. Polanyi, H. Kautz, J. Schmolze, M. Vilain

8. CONTRACT OR GRANT NUMBER(s): N00014-85-C-0079

9. PERFORMING ORGANIZATION NAME AND ADDRESS:
BBN Laboratories Inc.
10 Moulton Street
Cambridge, MA 02238

10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:

11. CONTROLLING OFFICE NAME AND ADDRESS:
Office of Naval Research
Department of the Navy
Arlington, VA 22217

12. REPORT DATE: October 1987

13. NUMBER OF PAGES: 12A

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):

15. SECURITY CLASS. (of this report): UNCLASSIFIED

15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:

16. DISTRIBUTION STATEMENT (of this Report):
Distribution of this document is unlimited. It may be released to
the Clearinghouse, Department of Commerce, for sale to the general
public.

17. DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if different from Report):

18. SUPPLEMENTARY NOTES:

19. KEY WORDS (Continue on reverse side if necessary and identify by block number):
Artificial Intelligence, Natural Language Understanding, Knowledge
Representation, Semantics, Semantic Networks, KL-TWO, NIKL, Frame Problem,
Reasoning, Reference, Miscommunication, Discourse, Planning, Qualitative
Physics, Robotics, Temporal Reasoning

20. ABSTRACT (Continue on reverse side if necessary and identify by block number):

BBN's DARPA project in Knowledge Representation for Natural Language
Communication and Planning Assistance has two primary objectives: (1) To perform
research on aspects of the interaction between users who are making complex
decisions and systems that are assisting them with their task. In particular,
this research is focused on communication and the reasoning required for
performing its underlying tasks of discourse processing, planning and plan
recognition, and communication repair. (2) Based on the research objectives,
to build tools for communication, plan recognition, and planning
assistance and for the representation of knowledge and reasoning that underlie
all of these processes.


This report summarizes BBN's second year's activity in research in knowledge
representation and natural language. In particular, the report discusses
our work in the areas of knowledge representation, planning, and discourse
modeling. We describe formalisms for representing knowledge necessary for
the planning process. These include the representation of natural events
and actions, constraint propagation algorithms for temporal reasoning, and
formalisms for circumventing the frame problem. The report also contains a
description of our research in discourse modelling in the area of reference.
We describe how to extend the reference identification component of a natural
language system to handle users' inaccurate descriptions of objects in the
world and how to model the user's use of pointing gestures to refer to objects
in the world. We also document publications and presentations by members of
the research group over the past year.


RESEARCH IN KNOWLEDGE REPRESENTATION FOR NATURAL LANGUAGE
COMMUNICATION AND PLANNING ASSISTANCE


Annual  Report 

18  March  1986  -  31  March  1987 


Principal Investigator:
Dr. Bradley A. Goodman


Prepared for:


Defense  Advanced  Research  Projects  Agency 
1400  Wilson  Boulevard 
Arlington,  VA  22209 


ARPA Order No. 3414


Contract No. N00014-85-C-0079


Effective Date of Contract
18 March 1985


Contract Expiration Date
17 March 1988


Amount of Contract:
$3,476,702


Scientific Officer
Dr. Alan R. Meyrowitz




Copyright  (c)  1987  BBN  Laboratories  Incorporated 




This research was supported by the Advanced Research Projects Agency of the
Department of Defense and was monitored by ONR under Contract No.
N00014-85-C-0079. The views and conclusions contained in this document are those of the
authors and should not be interpreted as necessarily representing the official policies,
either expressed or implied, of the Defense Advanced Research Projects Agency or the
U.S. Government.



TABLE OF CONTENTS


1. INTRODUCTION


2. CONSTRAINT PROPAGATION ALGORITHMS FOR TEMPORAL REASONING

2.1 Representing Time
2.2 The Interval Algebra
2.3 Determining Closure in the Interval Algebra
2.4 Intractability of the Interval Algebra
2.5 Consequences of Intractability
2.6 A Point Temporal Algebra
2.7 Computing Closure in the Point Algebra
2.8 Relating the Interval and Point Algebras
2.9 Consequences of These Results


3. PHYSICS FOR ROBOTS

3.1 Introduction
3.2 Composition of Materials
3.3 Simple Processes
3.4 Robot Perception and Action
3.5 Filling a Pot with Water
3.6 Conclusions
3.7 Acknowledgements


4. THE CASE FOR DOMAIN-SPECIFIC FRAME AXIOMS

4.1 Introduction
4.2 An Argument Against the Universal Frame Axiom
4.3 Domain-Specific Frame Axioms


5. A COMPOSITIONAL SEMANTICS FOR DIRECTIONAL MODIFIERS

5.1 Case-based Treatments
5.2 Motion Verbs as Location Predicates
5.3 The Semantics of to and toward
5.4 The Aspectual Effect of to and toward
5.5 Conclusion


6. REFERENCE AND REFERENCE FAILURES

6.1 Introduction
6.2 Reference
6.3 A new reference paradigm from a computational viewpoint
6.4 Summary
6.5 Future directions


7. POINTING THE WAY: A UNIFIED TREATMENT OF REFERENTIAL GESTURE IN
INTERACTIVE DISCOURSE

7.1 Introduction
7.2 Semantic Interpretation of Gesturally Suppleted, Verbally Incomplete
Propositions
7.3 A Compositional Syntax and Semantics for Sentences in Discourse
7.4 The Conversational Function of the Elliptical Utterance
7.5 The LDM
7.6 Structural Relations in Discourse
7.7 Discourse Parsing with the LDM
7.8 Analyzing the Discourse Context of B's Utterance
7.9 A Gestural Proposal
7.10 Conclusion


8. PUBLICATIONS


9. PRESENTATIONS



LIST OF FIGURES


Figure 2-1: Simple relations in the interval algebra
Figure 2-2: Examples of simple relations and relation vectors
Figure 2-3: Intervals whose relations are to be multiplied
Figure 2-4: The constraint propagation algorithm
Figure 2-5: Simple point relations
Figure 2-6: Addition and multiplication in the time point algebra
Figure 2-7: Translation of interval algebra to point algebra
Figure 6-1: Approaches to reference identification
Figure 6-2: Reordering referent candidates




1.  INTRODUCTION 

This project is aimed at developing techniques to provide computer assistance to
a decision maker for understanding and reacting to a complex situation. In order to
provide such assistance, a system must be able to understand the needs of the
decision maker through the modes of interaction in which the decision maker chooses
to communicate. Central to those tasks are the abilities to understand and produce
utterances with complex descriptions, to understand in the face of communication
errors, and to model the discourse to connect what the speaker says to his plans and
intentions and to previous requests. Handling a decision problem in a methodical way
requires that planning underlie the communication task. Our research proposes new
communication techniques and algorithms as well as tools for representing and
manipulating knowledge. Two principal components of our research are (1) extensions
of knowledge representation systems for representing time, user plans, user beliefs
and miscommunication situations, and new algorithms for agents that plan to obtain
new knowledge from the world and other agents, and (2) natural language systems that
understand the user's intentions, discourse conventions and communication errors.

The research plan we are following in this contract is aimed toward fundamental
problems in Knowledge Representation and Reasoning relevant to Natural Language
Communication and Planning Assistance. Central to this plan is our research in the
representation of plans, plan recognition, plan formation, reasoning about plans and
actions, and modelling the discourse. The exploration of these research topics
requires investigating both short-term and long-term solutions. We are also
attempting to transfer some of our research to other DARPA-supported activity at
BBN.

o Natural Language Communication. Central to the extension of natural language
understanding systems from handling single (isolated) utterances to coherent
dialogue is the modeling of discourse. Plans that represent the intentions
of the speaker and listener are a significant component of such a model.
The representation of user beliefs is an important constituent of these
plans. The process of inferring the user's plans from the user's utterances
(plan recognition) is the key to understanding dialogues. To further our
research in understanding the intentions of utterances we are [...] the
ability to understand in the face of miscommunication.

o Planning Assistance. Planning is another element underlying natural
language communication. A major component of our research is the
extension of knowledge representation systems for representing beliefs of
agents, actions, time, continuous processes, partial hypothetical plans and
multiple agents.

Our research during the past year has addressed major aspects of these
problems, yielding some significant results.

o We have developed a model of discourse structure including attention and
intention. This theory introduces shared plans that are developed by two
agents during an extended sequence of utterances.

o We have implemented a plan recognizer.

o We have analyzed and understood miscommunication phenomena surrounding
the use of noun phrase references in actual videotaped dialogues. We are
extending our theory to account for errors in recognizing the intentions
underlying speakers' utterances.

o We have demonstrated a reference mechanism based on relaxation matching
implemented in the KL-Two knowledge representation language.

o KL-Two has been integrated with RUP ("Reasoning Utility Package") as a
reasoning utility.

o We have developed a new representation formalism in which fluids are
modeled in a discrete manner, based on a notion called granules, and
continuous processes are modeled discretely in a manner that
permits serial as well as concurrent composition.

o We have developed a logic that permits representation of nested beliefs of
several agents, allows quantification with various scoping, and has efficient
reasoning based on first-order unification.

o As an initial exercise in parallel programming, we have a parallel
unification-based parser for a grammar, a grammar for natural language with
excellent coverage, a running program on a Vax and Symbolics, and a parallel
version on the BBN Butterfly.




An important consequence of our basic research is its transfer to on-going
applications. Consistent with that view, we are transferring many of the ideas,
concepts and software developed in this project to other projects both inside and
outside BBN.

o We have developed a richly expressive intensional logic language for
capturing the semantics of natural language sentences, including modality,
tense and context-dependence. This has now been transferred to the
Strategic Computation natural language project at BBN as a meaning
representation language for data base queries.

o KL-Two has been transferred and converted from InterLisp to CommonLisp.

o A unification-based parsing algorithm and an English grammar have been
transferred to the Strategic Computation natural language projects at BBN.

This report presents some of our research results in these areas over the past year.
In particular, we present papers in the areas of knowledge representation for
planning, semantics and discourse modelling.

Knowledge  representation  for  planning 

An area of major accomplishment this past year is in knowledge representation.
We made significant progress in temporal and physical reasoning. The paper by Vilain
and Kautz in this volume describes constraint propagation algorithms for temporal
reasoning. In that research, they investigated computational aspects of several time
representations and have shown that (1) the interval-based representation is
NP-complete, (2) the point-based representation is tractable, and (3) a subset of the
interval-based representation can be given a point-based representation while
preserving tractability.

Another paper by Schmolze details a representation formalism for modelling many
natural events and actions. In that research, an architecture was designed for a
robot planner in which physical knowledge of the world is distinct from and used by
the knowledge of the plan/search mechanism. The physical knowledge, called Physics
for Robots (PFR), covers domains not heretofore considered: actions of a robot,
natural events and processes that are controlled by a robot, and objects that change
over time. PFR departs from the work in Naive Physics (NP) [2, 3] in two ways: (1)
PFR characterizes the robot's capabilities to act and perceive, and (2) PFR replaces
the NP goal of developing models of actual common sense knowledge. Instead, PFR
includes all and only the knowledge that robots need for planning, which is determined
by analyzing proofs showing the effectiveness of robot I/O programs.

We also have made progress in the area of planning. The most fundamental
problem in automated plan generation is the frame problem [4]. Briefly stated, the
frame problem consists of determining those aspects of the world that are unaffected
by the performance of an action. This is a considerably broader class of properties
than those which are affected by the action. Most actions typically have
well-circumscribed effects, and leave much of the world unchanged. Moving a box from
room to room, for example, does not change its color, that of the rooms, the ambient
temperature, or any other property other than the location of the box.

General solutions to the frame problem have unfortunately been very elusive.
Recent efforts at formalizing planning, however, have turned to non-monotonic
solutions to the frame problem. Most of these formalizations are centered around a
general non-monotonic frame axiom. This axiom usually sanctions the inference that
a proposition persists from some state in which it is true to a later state if it cannot
be proven that the proposition has been changed by an intervening action. The
advantage of this approach is that it makes unnecessary the multitudinous frame
axioms of the early formal systems. However, the inability to prove that something
has changed may be due to the incompleteness of the database, not the fact that it
hasn't changed. In these cases, applying a general non-monotonic frame axiom could
yield erroneous conclusions.




These problems can be avoided by a careful reconsideration of the earlier
monotonic first-order planning formalisms. Indeed, it has generally been assumed that
a monotonic first-order formalization of a planning domain requires enormous numbers
of frame axioms to state all the properties that are unchanged by an action (roughly
one axiom per action-property pair). In fact, all that is needed is a small number of
axioms, indexed by properties of the domain. These axioms simply indicate those
actions which can change a given property; all others leave it alone. These
domain-specific frame axioms replace the general non-monotonic one, and sanction only
monotonic inferences, thereby avoiding the problems that arose with incomplete
databases in the non-monotonic formalization. This research is reported in greater
depth in the paper by Haas found in this volume.
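
The idea of indexing frame axioms by property can be illustrated with a small sketch. It is not taken from the paper by Haas; the toy domain, the table CAN_CHANGE and the helper names persists and apply_action are all hypothetical, chosen only to show one axiom per property listing the actions that can change it.

```python
# Hypothetical toy domain (illustrative names, not from the report).
# One "frame axiom" per property: the set of actions that can change it.
CAN_CHANGE = {
    "location": {"move"},
    "color": {"paint"},
}

def persists(prop, action):
    """True if the axiom for `prop` guarantees it is unchanged by `action`."""
    return action not in CAN_CHANGE[prop]

def apply_action(state, action, effects):
    """Return the successor state.

    A property is carried over only when its axiom says the action cannot
    change it. Otherwise its new value must come from `effects`; None marks
    a value that is unknown, so nothing is assumed by default.
    """
    new_state = {}
    for prop, value in state.items():
        if persists(prop, action):
            new_state[prop] = value
        else:
            new_state[prop] = effects.get(prop)
    return new_state

box = {"location": "room1", "color": "red"}
moved = apply_action(box, "move", {"location": "room2"})
print(moved)  # {'location': 'room2', 'color': 'red'}
```

In this sketch two table entries cover every action-property pair, and persistence is derived from the table rather than from a failure to prove change.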

Semantics 

This past year we made further progress in semantics. We developed a richly
expressive logic language for capturing the semantics of natural language sentences,
including modality, tense and context-dependence. That research has been
transferred to the Strategic Computation natural language projects at BBN for use as
a meaning representation language. One aspect of semantics addressed in that
research is the locative case for prepositions involving direction such as "to" and
"toward." In the paper by Hinrichs found in this report, a reformulation of the
locative case for prepositions involving direction is shown. This reformulation
improves upon the approach taken in case frame semantics or in conceptual
dependency semantics because it is strongly compositional, giving it a significant
computational advantage. The semantics allows prepositional phrases involving "to" to
perform in the same way with verbs of agent-changing-location such as "go" and
verbs with the agent stationary such as "wave." In previous theories each class of
verbs had to be treated separately with special inference rules. In our semantics,
general principles of the representation of verbs stipulate rules that will apply for
each subclass of verb.

Discourse  modelling 

We have demonstrated important work in discourse during the past year. The
work by Sidner on a model of discourse is reported on elsewhere (see [1]). We report
in this volume the work by Goodman in reference and by Hinrichs and Polanyi in
pointing. Both contribute to further elucidating necessary components of a discourse
model.


The paper by Goodman details how the reference component of a natural
language system must be expanded to handle typical ways that users inaccurately
refer to objects in the world. People often handle such poor descriptions routinely.
Our goal in this work was to extend our reference mechanism to recognize and isolate
such mistakes and circumvent them. In the paper we illustrate a framework less
restrictive than earlier ones. We claim that relaxation is an integral part of that
framework, providing a process for repairing a speaker's descriptions. Our theory
incorporates the same language and physical knowledge that people use in performing
reference identification to guide the relaxation process. This knowledge is
represented as a set of rules and as data in a hierarchical knowledge base.
Rule-based relaxation provides a methodical way to use knowledge about language and the
world to find a referent. The hierarchical representation made it possible to tackle
issues of imprecision and over-specification in a speaker's description. It allows one
to check the position of a description in a hierarchy and to use that position to judge
imprecision and over-specification and to suggest possible repairs to the description.

Pointing provides another means of referring in a discourse. In a situation
where natural language and pointing facilities are combined to make an interactive
system, a unified treatment of grammar, discourse model and gesture is useful. Such a
unified treatment is described in Polanyi [5]. We have demonstrated that in order to
account for the contextual relevance of linguistic units such as words, phrases, and
sentences as well as pointings, an adequate model must include (1) a compositional
syntax and semantics capable of dealing with fragmentary input and (2) a nested
discourse structure that assigns a suitable interpretation context to each structure
processed. The paper by Hinrichs and Polanyi describes the incorporation of
referential gesture as part of their model of discourse. Pointing gestures are an
important part of the communication process because they provide a concise, though
vague, method of indicating to other conversational participants the intended objects
of reference. Their use simplifies the language of referring expressions and provides
further evidence to listeners that they have found the correct referent.




References 

1. Grosz, B.J. and Sidner, C.L. "Attention, Intentions, and the Structure of Discourse".
Computational Linguistics 12, 3 (1986).

2. Hayes, P. The Second Naive Physics Manifesto. In Formal Theories of the
Commonsense World, Ablex, 1985, pp. 1-36.

3. Hayes, P. Naive Physics I: Ontology for Liquids. In Formal Theories of the
Commonsense World, Ablex, 1985, pp. 71-108.

4. McCarthy, J., and Hayes, P.J. Some Philosophical Problems from the Standpoint of
Artificial Intelligence. In Machine Intelligence 4, B. Meltzer & D. Michie, Eds., American
Elsevier, New York, 1969.

5. Polanyi, L. The Linguistic Discourse Model: Towards a Formal Theory of Discourse
Structure. Technical Report 6409, BBN Laboratories Inc., Cambridge, MA, 1986.




2.  CONSTRAINT  PROPAGATION  ALGORITHMS  FOR  TEMPORAL  REASONING 


Marc  Vilain,  Henry  Kautz 


Abstract: This paper considers computational aspects of several temporal
representation languages. It investigates an interval-based representation,
and a point-based one. Computing the consequences of temporal assertions is
shown to be computationally intractable in the interval-based representation,
but not in the point-based one. However, a fragment of the interval
language can be expressed using the point language and benefits from the
tractability of the latter.


2.1  Representing  Time 


The representation of time has been a recurring concern of Artificial Intelligence
researchers. Many representation schemes have been proposed for temporal
reasoning; of these, one of the most attractive is James Allen's algebra of temporal
intervals [Allen 83]. This representation scheme is particularly appealing for its
simplicity and for its ease of implementation with constraint propagation algorithms.


Reasoners based on this algebra have been put to use in several ways. For
example, the planning system of Allen and Koomen [1983] relies heavily on the
temporal algebra to perform reasoning about the ordering of actions. Elegant
approaches such as this one may be compromised, however, by computational
characteristics of the interval algebra. This paper concerns itself with these
computational aspects of Allen's algebra, and of a simpler algebra of time points.


Our perspective here is primarily computation-theoretic. We approach the
problem of temporal representation by asking questions of complexity and tractability.
In this light, this paper examines Allen's interval algebra, and the simpler algebra of
time points.


The bulk of the paper establishes some formal results about the temporal
algebras. In brief, these results are:




o Determining consistency of statements in the interval algebra is NP-hard, as
is determining all consequences of these statements. Allen's polynomial-time
constraint propagation algorithm is sound but not complete for these tasks.

o In contrast, constraint propagation is sound and complete for computing
consistency and consequences of assertions in the time point algebra. It
operates in O(n^3) time and O(n^2) space.

o A restricted form of the interval algebra can be formulated in terms of the
time point algebra. Constraint propagation is sound and complete for this
fragment.

Throughout  the  paper,  we  consider  how  these  formal  results  affect  practical  Artificial 
Intelligence  programs. 


2.2  The  Interval  Algebra 

Allen's interval algebra has been described in detail in [Allen 83]. In brief, the
elements of the algebra are relations that may exist between intervals of time.
Because the algebra allows for indefiniteness in temporal relations, it admits many
possible relations between intervals (2^13 in fact). But all of these relations can be
expressed as vectors of definite simple relations, of which there are only thirteen.^1
The thirteen simple relations, whose definitions appear in Figure 2-1, precisely
characterize the relative starting and ending points of two temporal intervals. If the
relation between two intervals is completely defined, then it can be exactly described
with a simple relation. Alternatively, vectors of simple relations introduce
indefiniteness in the description of how two temporal intervals relate. Vectors are
interpreted as the disjunction of their constituent simple relations.

^1 In fact, these thirteen simple relations can in turn be expressed in terms of
universally and existentially quantified expressions involving only one truly primitive
relation. For details, see [Allen & Hayes 85].

A BEFORE B      B AFTER A
A MEETS B       B MET-BY A
A OVERLAPS B    B OVERLAPPED-BY A
A STARTS B      B STARTED-BY A
A DURING B      B CONTAINS A
A ENDS B        B ENDED-BY A
A EQUALS B      B EQUALS A

Figure 2-1: Simple relations in the interval algebra

Two examples will serve to clarify these distinctions (please refer to Figure 2-2).
Consider the simple relations BEFORE and AFTER: they hold between two intervals
that strictly follow each other, without overlapping or meeting. The two differ by the
order of their arguments: today John ate his breakfast BEFORE he ate his lunch, and
he ate his lunch AFTER he ate his breakfast. To illustrate relation vectors, consider
the vector (BEFORE MEETS OVERLAPS). It holds between two intervals when the
starting point of the first strictly precedes that of the second, and the ending point
of the first strictly precedes that of the second. The relation between the ending
point of the first interval and the starting point of the second is left ambiguous. For
instance, say this morning John started reading the paper before starting breakfast,
and he finished the paper before his last sip of coffee. If we didn't know whether he
was done with the paper before starting his coffee, at the same time as he started it,
or after, we would then have:

PAPER (BEFORE MEETS OVERLAPS) COFFEE

Returning to our formal discussion, we note that the interval algebra is
principally defined in terms of vectors. Although simple relations are an integral part
of the formalism, they figure primarily as a convenient way of notating vector
relations. The mathematical operations defined over the algebra are given in terms of
vectors; in a reasoner built on the temporal algebra, all user assertions are made with
vectors.




[Figure: timeline diagrams.
Simple relations: Breakfast BEFORE Lunch; Lunch AFTER Breakfast.
Relation vector: Paper (BEFORE MEETS OVERLAPS) Coffee, where the end of Paper
may fall before, at, or after the start of Coffee.]

Figure  2-2:  Examples  of  simple  relations  and  relation  vectors 


Two operations, an addition and a multiplication, are defined over vectors in the
interval algebra. Given two different vectors describing the relation between the same
pair of intervals, the addition operation "intersects" these vectors to provide the least
restrictive relation that the two vectors together admit. The need to add two vectors
arises from situations where one has several independent measures of the relation of
two intervals. These measures are combined by summing the relation vectors for the
measures. For example, say the relation between intervals A and B has been derived
by two valid measures as being both

V_1 = (BEFORE MEETS OVERLAPS)
V_2 = (OVERLAPS STARTS DURING)

To find the relation between A and B that is implied by V_1 and V_2, the two vectors
are summed:

V_1 + V_2 = (OVERLAPS).

Algorithmically, the sum of two vectors is computed by finding their common
constituent simple relations.
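
The summation just described can be sketched in Python. This is only an illustration: it assumes vectors are represented as sets of simple-relation names, and the function name add_vectors is hypothetical rather than taken from the report.

```python
# Illustrative sketch: a relation vector is a set of simple-relation names.
SIMPLE_RELATIONS = frozenset({
    "BEFORE", "AFTER", "MEETS", "MET-BY", "OVERLAPS", "OVERLAPPED-BY",
    "STARTS", "STARTED-BY", "DURING", "CONTAINS", "ENDS", "ENDED-BY",
    "EQUALS",
})


def add_vectors(v1, v2):
    """Sum of two vectors on the same pair of intervals: their common constituents."""
    return frozenset(v1) & frozenset(v2)


v1 = {"BEFORE", "MEETS", "OVERLAPS"}
v2 = {"OVERLAPS", "STARTS", "DURING"}
print(sorted(add_vectors(v1, v2)))  # ['OVERLAPS']
```

Under this representation the vector of all thirteen simple relations leaves any other vector unchanged when added to it, and an empty result means the two measures contradict each other.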

Multiplication is defined between pairs of vectors that relate three intervals A, B,
and C. More precisely, if V_1 relates intervals A and B, and V_2 relates B and C, the
product of V_1 and V_2 is the least restrictive relation between A and C that is
permitted by V_1 and V_2. Consider, for example, the situation in Figure 2-3. If we
have

V_1 = (BEFORE MEETS OVERLAPS)
V_2 = (BEFORE MEETS)

then the product of V_1 and V_2 is

V_1 x V_2 = (BEFORE)

As with addition, the multiplication of two vectors is computed by inspecting their
constituent simple relations. The constituents are pairwise multiplied by following a
simplified multiplication table, and the results are combined to produce the product of
the two vectors. See [Allen 83] for details.


R<A,B> = (BEFORE MEETS OVERLAPS)
R<B,C> = (BEFORE MEETS)

R<A,C> = (BEFORE)

Figure 2-3: Intervals whose relations are to be multiplied
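
The pairwise multiplication can be sketched as follows. The report relies on the multiplication table of [Allen 83], which is not reproduced here; as a stand-in, this sketch derives a table for simple relations by brute-force enumeration of small integer-endpoint intervals. The names classify, COMPOSE, and multiply_vectors are hypothetical.

```python
from itertools import product


def classify(a, b):
    """Simple relation between intervals a = (start, end) and b = (start, end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "BEFORE"
    if a1 == b0:
        return "MEETS"
    if b1 < a0:
        return "AFTER"
    if b1 == a0:
        return "MET-BY"
    if a0 == b0:
        return "EQUALS" if a1 == b1 else ("STARTS" if a1 < b1 else "STARTED-BY")
    if a1 == b1:
        return "ENDS" if a0 > b0 else "ENDED-BY"
    if a0 > b0:
        return "DURING" if a1 < b1 else "OVERLAPPED-BY"
    return "OVERLAPS" if a1 < b1 else "CONTAINS"


# Three intervals have six endpoints, so endpoint values 0..5 realize every
# possible ordering (with ties) of those endpoints.
INTERVALS = [(s, e) for s in range(6) for e in range(s + 1, 6)]

# COMPOSE[(r1, r2)] = simple relations possible between A and C
# when A r1 B and B r2 C.
COMPOSE = {}
for a, b, c in product(INTERVALS, repeat=3):
    COMPOSE.setdefault((classify(a, b), classify(b, c)), set()).add(classify(a, c))


def multiply_vectors(v1, v2):
    """Product of two vectors: union of the pairwise products of constituents."""
    result = set()
    for r1 in v1:
        for r2 in v2:
            result |= COMPOSE[(r1, r2)]
    return frozenset(result)


print(sorted(multiply_vectors({"BEFORE", "MEETS", "OVERLAPS"}, {"BEFORE", "MEETS"})))
# ['BEFORE']
```

A production reasoner would store the table directly instead of deriving it at start-up; the enumeration is used here only to keep the sketch self-contained.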




2.3  Determining  Closure  in  the  Interval  Algebra 




In actual use, Allen's interval algebra is used to reason about temporal
information in a specific application. The application program encodes temporal
information in terms of the algebra, and asserts this information in the database of
the temporal reasoner. This reasoner's job is then to compute those temporal
relations which follow from the user's assertions. We refer to this process as
completing the closure of the user's assertions.

In Allen's model, closure is computed with a constraint propagation algorithm.
The operation of this forward-chaining algorithm is driven by a queue. Every time the
relation between two intervals A and B is changed, the pair <A, B> is placed on the
queue. The algorithm, shown in Figure 2-4, operates by removing pairs from the
queue. For every pair <A, B> that it removes, the algorithm determines whether the
relation between A and B can be used to constrain the relation between A and other
intervals in the database, or between B and these other intervals. If a new relation
can be successfully constrained, then the pair of intervals that it relates is in turn
placed on the queue. The process terminates when no more relations can be
constrained.

As Allen suggests [Allen 83], this constraint propagation algorithm runs to
completion in time polynomial in the number of intervals in the temporal database.
He provides an estimate of O(n^2) calls to the Propagate procedure. A more
fine-grained analysis reveals that when the algorithm runs to completion, it will have
performed O(n^3) multiplications and additions of temporal relation vectors.

Theorem 1: Let I be a set of n intervals about which m assertions have
been added with the Add procedure. When invoked, the Close procedure will
run to completion in O(n^3) time.

Proof: (Sketch^2) A pair of intervals <i,j> is entered on Queue when its
relation, stored in Table[i,j], is non-trivially updated. It is easy to show that
no more than O(n^2) pairs of intervals <i,j> are ever entered onto the queue.
This is because there are only O(n^2) relations possible between the n


^2 Most of the theorems in this paper have rather long proofs. For this reason, we have
restricted ourselves here to providing only proof sketches.



/* Let Table be a two-dimensional array, indexed by intervals, in which Table[i,j] holds the
relation between intervals i and j. Table[i,j] is initialized to (BEFORE MEETS ... AFTER),
the additive identity vector consisting of all thirteen simple relations; except for Table[i,i]
which is initialized to (EQUAL). Let Queue be a FIFO data structure that will keep track of
those pairs of intervals whose relation has been changed. Let Intervals be a list of all intervals
about which assertions have been made. */

To Add(R<i,j>)
/* R<i,j> is a relation being asserted between i and j. */
begin
    Old <- Table[i,j];
    Table[i,j] <- Table[i,j] + R<i,j>;
    if Table[i,j] ≠ Old
        then Place <i,j> on Fifo Queue;
    Intervals <- Intervals ∪ {i,j};
end;

To Close
/* Computes the closure of assertions added to the database. */
While Queue is not empty do
begin
    Get next <i,j> from Queue;
    Propagate(i,j);
end;

To Propagate(I,J)
/* Called to propagate the change to the relation between
   intervals I and J to all other intervals. */
For each interval K in Intervals do
begin
    Temp <- Table[I,K] + (Table[I,J] x Table[J,K]);
    if Temp = 0
        then {signal contradiction};
    if Table[I,K] ≠ Temp
        then Place <I,K> on Queue;
    Table[I,K] <- Temp;
    Temp <- Table[K,J] + (Table[K,I] x Table[I,J]);
    if Temp = 0
        then {signal contradiction};
    if Table[K,J] ≠ Temp
        then Place <K,J> on Queue;
    Table[K,J] <- Temp;
end;

Figure  2-4:  The  constraint  propagation  algorithm 




intervals, and because each relation can only be non-trivially updated a
constant number of times.

Further, every time a pair <i,j> is removed from Queue, the algorithm
performs O(n) vector additions and multiplications (in the body of the
Propagate procedure). Hence the time complexity of the algorithm is
O(n · n^2) = O(n^3) vector operations.
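
The counting behind this bound can be spelled out as a worked inequality. The constants below are an illustrative reading of Figure 2-4 and are not stated in the text: each non-trivial update removes at least one of the thirteen simple relations from a table entry, so an entry can change at most 13 times, and each call to Propagate performs two additions and two multiplications for each of the n intervals.

```latex
\[
\underbrace{13\,n^{2}}_{\text{bound on queue insertions}}
\;\times\;
\underbrace{4\,n}_{\text{vector operations per call to Propagate}}
\;=\; 52\,n^{3} \;=\; O(n^{3}).
\]
```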

The vector operations can be considered here to take constant time. By
encoding vectors as bit strings, addition can be performed with a 13-bit integer
operation. For multiplication, the complexity is actually O(|V_1| x |V_2|), where |V_1| and
|V_2| are the "lengths" of the two vectors to be multiplied (i.e., the number of simple
constituents in each vector). Since vectors contain at most 13 simple constituents,
the complexity of multiplication is bounded, and the idealization of multiplication as
operating in constant time is acceptable.
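
The bit-string encoding can be sketched as follows. The assignment of bit positions to relations is an assumption made for illustration (the report does not fix one), and the helper names encode, decode, add, and length are hypothetical.

```python
# Assumed bit order: one bit per simple relation, in the order listed here.
SIMPLE = ["BEFORE", "MEETS", "OVERLAPS", "STARTS", "DURING", "ENDS", "EQUALS",
          "ENDED-BY", "CONTAINS", "STARTED-BY", "OVERLAPPED-BY", "MET-BY", "AFTER"]
BIT = {name: 1 << k for k, name in enumerate(SIMPLE)}
ALL = (1 << len(SIMPLE)) - 1  # all thirteen bits set: the additive identity


def encode(names):
    """Pack a collection of simple-relation names into a 13-bit integer."""
    mask = 0
    for name in names:
        mask |= BIT[name]
    return mask


def decode(mask):
    """List the simple relations whose bits are set in mask."""
    return [name for name in SIMPLE if mask & BIT[name]]


def add(v1, v2):
    """Vector addition is a single bitwise AND on the encoded vectors."""
    return v1 & v2


def length(v):
    """Number of simple constituents, the |V| of the multiplication bound."""
    return bin(v).count("1")


v1 = encode(["BEFORE", "MEETS", "OVERLAPS"])
v2 = encode(["OVERLAPS", "STARTS", "DURING"])
print(decode(add(v1, v2)))      # ['OVERLAPS']
print(length(v1) * length(v2))  # 9 pairwise products; never more than 13 * 13
```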

Note that the polynomial time characterization of the constraint propagation
algorithm of Figure 2-4 is somewhat misleading. Indeed, Allen [1983] demonstrates
that the algorithm is sound, in the sense that it never infers an invalid consequence
of a set of assertions. However, Allen also shows that the algorithm is incomplete: he
produces an example in which the algorithm does not make all the inferences that
follow from a set of assertions. He suggests that computing the closure of a set of
temporal assertions might only be possible in exponential time. Regrettably, this
appears to be the case. As we demonstrate in the following paragraphs, computing
closure in the interval algebra is an NP-hard problem.


2.4  Intractability  of  the  Interval  Algebra 


To demonstrate that computing the closure of assertions is NP-hard, we first
show that determining the consistency (or satisfiability) of a set of assertions is
NP-hard. We then show that the consistency and closure problems are equivalent.

Theorem 2: Determining the satisfiability of a set of assertions in the
interval algebra is NP-hard.

Proof: (Sketch) This theorem can be proven by reducing the 3-clause
satisfiability problem (or 3-SAT) to the problem of determining satisfiability of
assertions in the interval algebra. To do this, we construct a
(computationally trivial) mapping between a formula in 3-SAT form and an
equivalent encoding of the formula in the interval algebra.

Briefly, this is done by creating for each term P in the formula, and its
negation ~P, a pair of intervals, P and NOTP. These intervals are then
related to a "truth determining" interval MIDDLE: intervals that fall before
MIDDLE correspond to false terms, and those that fall after MIDDLE
correspond to true terms. The original formula is then encoded into
assertions in the algebra; this can be done (deterministically) in polynomial
time.

The encoding proceeds clause by clause. For each clause P V Q V R,
special intervals are created. These intervals are related to the literals'
intervals P, Q, and R in such a way that at most two of these intervals can
be before MIDDLE (which makes them false). The other (or others) can fall
after MIDDLE (which makes them true).

It can then be shown that the original formula has a model just in case
the interval encoding has one too. Satisfiability of a 3-SAT formula could
thus be established by determining the satisfiability of the corresponding
interval algebra assertions. Since the former problem is NP-complete, the
latter one must be (at least) NP-hard.

The following theorem extends the NP-hard result for the problem of determining
satisfiability of assertions in the interval algebra to the problem of determining
closure of these assertions.

Theorem 3: The problems of determining the satisfiability of assertions
in the interval algebra and determining their closure are equivalent, in that
there are polynomial-time mappings between them.

Proof: (Sketch) First we show that determining closure follows readily
from determining consistency. To do so, assume the existence of an oracle
for determining the consistency of a set of assertions in the interval algebra.
To determine the closure of the assertions, we run the oracle thirteen times
for each of the O(n^2) pairs <i,j> of intervals mentioned in the assertions.
Specifically, each time we run the oracle on a pair <i,j> we provide the
oracle with the original set of assertions and the additional assertion i (R) j,
where R is one of the thirteen simple relations. The relation vector that
holds between i and j is the one containing those simple relations that the
oracle didn't reject.
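
The first half of this argument can be sketched in Python. The oracle below is a hypothetical stand-in: it decides consistency by exhaustive search over integer endpoints (exponential in the number of intervals), since the proof only assumes that some oracle exists. The names classify, brute_force_oracle, and closure_from_oracle are illustrative and do not come from the report.

```python
from itertools import product

SIMPLE = ("BEFORE", "MEETS", "OVERLAPS", "STARTS", "DURING", "ENDS", "EQUALS",
          "ENDED-BY", "CONTAINS", "STARTED-BY", "OVERLAPPED-BY", "MET-BY", "AFTER")


def classify(a, b):
    """Simple relation between intervals a = (start, end) and b = (start, end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 < b0:
        return "BEFORE"
    if a1 == b0:
        return "MEETS"
    if b1 < a0:
        return "AFTER"
    if b1 == a0:
        return "MET-BY"
    if a0 == b0:
        return "EQUALS" if a1 == b1 else ("STARTS" if a1 < b1 else "STARTED-BY")
    if a1 == b1:
        return "ENDS" if a0 > b0 else "ENDED-BY"
    if a0 > b0:
        return "DURING" if a1 < b1 else "OVERLAPPED-BY"
    return "OVERLAPS" if a1 < b1 else "CONTAINS"


def brute_force_oracle(names, assertions):
    """Stand-in consistency oracle: try every placement of the intervals."""
    slots = range(2 * len(names))  # 2n endpoint values realize any ordering
    candidates = [(s, e) for s in slots for e in slots if s < e]
    for choice in product(candidates, repeat=len(names)):
        place = dict(zip(names, choice))
        if all(classify(place[i], place[j]) in vec for i, vec, j in assertions):
            return True
    return False


def closure_from_oracle(names, assertions, oracle=brute_force_oracle):
    """Thirteen oracle calls per ordered pair; keep the relations not rejected."""
    closure = {}
    for i in names:
        for j in names:
            if i != j:
                closure[(i, j)] = {r for r in SIMPLE
                                   if oracle(names, assertions + [(i, {r}, j)])}
    return closure


names = ["A", "B", "C"]
assertions = [("A", {"BEFORE", "MEETS", "OVERLAPS"}, "B"),
              ("B", {"BEFORE", "MEETS"}, "C")]
result = closure_from_oracle(names, assertions)
print(sorted(result[("A", "C")]))  # ['BEFORE']
```

The outer loops make a polynomial number of oracle calls, so all of the cost sits inside the oracle, which is the point of the reduction.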

To show that determining consistency follows from determining closure,
assume the existence of a closure algorithm. To see if a set of assertions is
consistent, run the algorithm, and inspect each of the O(n^2) relations between
the n intervals mentioned in the assertions. The database is inconsistent if
any of these relations is the inconsistent vector; this is the vector composed
of no constituent simple relations.




The two preceding theorems demonstrate that computing the closure of
assertions in the interval algebra is NP-hard. This result casts great doubt on the
computational tractability of the algebra, as no NP-hard problem is known to be
solvable in less than exponential time.

2.5  Consequences  of  Intractability 

Several authors have described exponential-time algorithms that compute the
closure of assertions in the interval algebra, or some subset thereof. Valdés-Pérez
[1986] proposes a heuristically pruned algorithm which is sound and complete for the
full algebra. The algorithm is based on analysis of set-theoretic constructions. Malik
& Binford [1983] can determine closure for a fraction of the interval algebra with the
exponential Simplex algorithm. As we shall show below, their method is actually more
powerful than need be for the fragment that they consider.

Even though the interval algebra is intractable, it isn't necessarily useless.
Indeed, it is almost a truism of Artificial Intelligence that all interesting problems are
computationally at least NP-hard (or worse)! There are several strategies that can be
adopted to put the algebra to work in practical systems.

The first is to limit oneself to small databases, containing on the order of a
dozen intervals. With a small database, the asymptotically exponential performance of
a complete temporal reasoner need not be noticeably poor. This is in fact the
approach taken by Malik and Binford to manage the exponential performance of their
Simplex-based system. Unfortunately, it can be very difficult to restrict oneself to
small databases, since clustering information in this way necessarily prevents all but
the simplest interrelations of intervals in separate databases.

Another strategy is to stick to the polynomial-time constraint propagation
closure algorithm, and accept its incompleteness. This is acceptable for applications
which use a temporal database to notate the relations between events, but don't
particularly require much inference from the temporal reasoner. For applications
which make heavy use of temporal reasoning, however, this may not be an option.



Finally, an alternative approach is to choose a temporal representation other
than the full interval algebra. This can be either a fragment of the algebra, or
another representation altogether. We pursue this option below.


2.6  A  Point  Temporal  Algebra 

An alternative to reasoning about intervals of time is to reason about points of
time. Indeed, an algebra of time points can be defined in much the same way as was
the algebra of time intervals. As with intervals, points are related to each other
through relation vectors which are composed of simple point relations. These primitive
relations are defined in Figure 2-5.


A PRECEDES B      (point A lies before point B)
A SAME B          (points A and B coincide)
A FOLLOWS B       (point A lies after point B)

Figure  2-5:  Simple  point  relations 




As with the interval algebra, the point temporal algebra possesses addition and
multiplication operations. These operations, whose tables are given in Figure 2-6,
mirror the operations in the interval algebra. Addition is used to combine two
different measures of the relation of two points. Multiplication is used to determine
the relation between two points A and B, given the relations between each of A and B
and some intermediate point C. These operations both have constant-time
implementations if the relation vectors between time points are encoded as bit strings.
With this encoding both operations can be performed by simple lookups in
two-dimensional (8 x 8) arrays. Alternatively, addition can be performed with an even
simpler 3-bit logical AND operation.




  +  |  <  |  <= |  >  |  >= |  =  |  ~= |  ?
-----+-----+-----+-----+-----+-----+-----+-----
  <  |  <  |  <  |  0  |  0  |  0  |  <  |  <
  <= |  <  |  <= |  0  |  =  |  =  |  <  |  <=
  >  |  0  |  0  |  >  |  >  |  0  |  >  |  >
  >= |  0  |  =  |  >  |  >= |  =  |  >  |  >=
  =  |  0  |  =  |  0  |  =  |  =  |  0  |  =
  ~= |  <  |  <  |  >  |  >  |  0  |  ~= |  ~=
  ?  |  <  |  <= |  >  |  >= |  =  |  ~= |  ?

  x  |  <  |  <= |  >  |  >= |  =  |  ~= |  ?
-----+-----+-----+-----+-----+-----+-----+-----
  <  |  <  |  <  |  ?  |  ?  |  <  |  ?  |  ?
  <= |  <  |  <= |  ?  |  ?  |  <= |  ?  |  ?
  >  |  ?  |  ?  |  >  |  >  |  >  |  ?  |  ?
  >= |  ?  |  ?  |  >  |  >= |  >= |  ?  |  ?
  =  |  <  |  <= |  >  |  >= |  =  |  ~= |  ?
  ~= |  ?  |  ?  |  ?  |  ?  |  ~= |  ?  |  ?
  ?  |  ?  |  ?  |  ?  |  ?  |  ?  |  ?  |  ?

Key to symbols:

  0    the null vector
  <    (PRECEDES)
  <=   (PRECEDES SAME)
  >    (FOLLOWS)
  >=   (SAME FOLLOWS)
  =    (SAME)
  ~=   (PRECEDES FOLLOWS)
  ?    (PRECEDES SAME FOLLOWS)

Figure 2-6: Addition and multiplication in the time point algebra




2.7  Computing Closure in the Point Algebra


As was the case with intervals, determining the closure of assertions in the point
algebra is an important operation. Fortunately, the point algebra is sufficiently simple
that closure can be computed in polynomial time. To do so, we can directly adapt the
constraint propagation algorithm of Figure 2-4. Simply replace the interval vector
addition and multiplication operations with point additions and multiplications, and run
the algorithm with point assertions instead of interval assertions.
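
The adaptation can be sketched in Python as below. This is a sketch under stated assumptions rather than the report's implementation: vectors use an assumed 3-bit encoding, the class and method names are hypothetical, and the sketch also records the inverse relation Table[j,i] whenever Table[i,j] changes, a bookkeeping step that Figure 2-4 does not show.

```python
from collections import deque

PRECEDES, SAME, FOLLOWS = 1, 2, 4      # assumed bit encoding
ANY = PRECEDES | SAME | FOLLOWS
SIMPLE_PRODUCT = {
    (PRECEDES, PRECEDES): PRECEDES, (PRECEDES, SAME): PRECEDES, (PRECEDES, FOLLOWS): ANY,
    (SAME, PRECEDES): PRECEDES, (SAME, SAME): SAME, (SAME, FOLLOWS): FOLLOWS,
    (FOLLOWS, PRECEDES): ANY, (FOLLOWS, SAME): FOLLOWS, (FOLLOWS, FOLLOWS): FOLLOWS,
}


def multiply(v1, v2):
    out = 0
    for (r1, r2), r3 in SIMPLE_PRODUCT.items():
        if v1 & r1 and v2 & r2:
            out |= r3
    return out


def inverse(v):
    """Swap the PRECEDES and FOLLOWS bits; SAME is its own inverse."""
    return (v & SAME) | (FOLLOWS if v & PRECEDES else 0) | (PRECEDES if v & FOLLOWS else 0)


class Contradiction(Exception):
    pass


class PointDatabase:
    def __init__(self):
        self.table = {}        # (i, j) -> encoded vector; a missing entry means ANY
        self.points = []
        self.queue = deque()   # FIFO queue of pairs whose relation changed

    def relation(self, i, j):
        return SAME if i == j else self.table.get((i, j), ANY)

    def _constrain(self, i, j, vector):
        new = self.relation(i, j) & vector          # vector addition
        if new == 0:
            raise Contradiction((i, j))
        if new != self.relation(i, j):
            self.table[(i, j)] = new
            self.table[(j, i)] = inverse(new)       # assumption: not in Figure 2-4
            self.queue.append((i, j))

    def add(self, i, vector, j):
        for p in (i, j):
            if p not in self.points:
                self.points.append(p)
        self._constrain(i, j, vector)

    def close(self):
        while self.queue:
            i, j = self.queue.popleft()
            for k in self.points:                   # body of Propagate(i, j)
                self._constrain(i, k, multiply(self.relation(i, j), self.relation(j, k)))
                self._constrain(k, j, multiply(self.relation(k, i), self.relation(i, j)))


db = PointDatabase()
db.add("A", PRECEDES, "B")
db.add("C", FOLLOWS, "B")
db.close()
print(db.relation("A", "C") == PRECEDES)  # True
```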

As before, the algorithm runs to completion in O(n^3) time, where n is the number
of points about which assertions have been made. As with the interval algebra, the
algorithm is sound: any relation that it infers between two points follows from the
user's assertions. This time, however, the algorithm is complete. When it terminates,
the closure of the point assertions will have been correctly computed.

We prove completeness by referring to the model theory of the time point
algebra. In essence, we consider any database over which the algorithm has been run,
and construct a model for any possible interpretation of the database. If the
database is indefinite, a model must be constructed for each possible resolution of the
indefiniteness.^3

We choose the real numbers to model time points. A model of a database of time
points is simply a mapping between those time points and some corresponding real
numbers. The relations between time points are mapped to relations between real
numbers in the obvious way. For example, if time point A precedes time point B in the
database, then A's corresponding number is less than B's.

Theorem 4: The constraint propagation algorithm is complete for the
time point algebra. That is, a model can be constructed for any
interpretation of the processed database.

Proof: (Sketch) We first note that the algorithm partitions the database


^3 This demonstrates completeness in the following sense. If there were an interpretation
of the processed database for which no model could be constructed, the algorithm would be
incomplete. It would have failed to eliminate a possible interpretation prohibited by the
original assertions.




into one or more partial order graphs. After the algorithm is run, each node
in a graph corresponds to a cluster of points. These are all points related
to one another by the vector (SAME); note that the algorithm computes the transitive
closure of (SAME) assertions. Arcs in the graph either indicate precedence
(the vectors (PRECEDES) or (PRECEDES SAME), or their inverses) or
disequality (the vector (PRECEDES FOLLOWS)). At the bottom of each graph
is one or more "bottom" nodes: nodes which are preceded by no other node.

Further, when the algorithm has run to completion the graphs are all
consistent, in the following two senses. First, all points are linearly ordered:
there is no path from any point in a graph back to itself that solely
traverses precedence arcs (time doesn't curve back on itself). Second, no
two points that are in the same cluster were asserted to be disequal with the
(PRECEDES FOLLOWS) vector. If the user had added any assertions that
contradicted these consistency criteria, the algorithm would have signalled
the contradiction.

Note that all of the preceding properties can be shown with simple
inductive proofs by considering the algorithm and the addition and
multiplication tables.

The model construction proceeds by picking a cluster of points (i.e., a
node) at the "bottom" of some graph and assigning all of its constituent
points to some real number. The cluster is then removed from the graph, and
the process proceeds with another real number (greater than the first)
and another cluster (either in the same graph or in another one). The
process is complicated somewhat because some clusters may be "equal" to
other clusters (their constituent points may be related by some vector
containing the SAME relation). For these cases it is possible to "collapse"
several (zero, one, or more) of these clusters together, and assign their
constituent points to the same real number. Some other clusters may be
"disequal". For these, we must just make sure never to "collapse" them
together. Because the choice of which "bottom" node to remove and which
clusters to collapse is non-deterministic, the model construction covers all
possible interpretations of the database.


2.8  Relating the Interval and Point Algebras

The tractability of the point algebra makes it an appealing candidate for
representing time. Indeed, many problems that involve temporal sequencing can be
formulated in terms of simple points of time. This approach is taken by any of the
planning programs that are based on the situation calculus, the patriarch of these
being STRIPS [Fikes & Nilsson 71].




However, as many have pointed out, time points as such are inadequate for
representing many real phenomena. Single time points by themselves aren't sufficient
to express natural language semantics [Allen 84], and they are very inconvenient (if
not useless) for modelling many natural events and actions [Schmolze 86]. For these
tasks, an interval-based time representation is necessary.

Fortunately, many interval relations can be encoded in the point algebra. This
is accomplished by considering intervals as defined by their endpoints, and by
encoding the relation between two intervals as relations between their endpoints. For
example, the interval relation

A (DURING) B

can be encoded as several point assertions:

A+ (FOLLOWS) A-
B+ (FOLLOWS) B-
A- (FOLLOWS) B-
B+ (FOLLOWS) A+

where A- denotes the starting endpoint of interval A, A+ denotes its finishing
endpoint, and similarly for B.
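
The endpoint encoding of a simple interval relation can be sketched as a table lookup. The SIGNATURE table and the function point_assertions are hypothetical names; the table records, for each simple relation, how the endpoints of A compare with those of B. Unlike the DURING example, the sketch lists all four cross-endpoint relations, some of which are implied by the others.

```python
# For each simple relation: (A- vs B-, A- vs B+, A+ vs B-, A+ vs B+).
SIGNATURE = {
    "BEFORE":        "<<<<",
    "MEETS":         "<<=<",
    "OVERLAPS":      "<<><",
    "STARTS":        "=<><",
    "DURING":        "><><",
    "ENDS":          "><>=",
    "EQUALS":        "=<>=",
    "ENDED-BY":      "<<>=",
    "CONTAINS":      "<<>>",
    "STARTED-BY":    "=<>>",
    "OVERLAPPED-BY": "><>>",
    "MET-BY":        ">=>>",
    "AFTER":         ">>>>",
}
WORD = {"<": "PRECEDES", "=": "SAME", ">": "FOLLOWS"}
PAIRS = (("-", "-"), ("-", "+"), ("+", "-"), ("+", "+"))


def point_assertions(a, relation, b):
    """Point-algebra assertions equivalent to the simple relation 'a relation b'."""
    # Every interval starts strictly before it finishes.
    out = [f"{a}- (PRECEDES) {a}+", f"{b}- (PRECEDES) {b}+"]
    for (end_a, end_b), symbol in zip(PAIRS, SIGNATURE[relation]):
        out.append(f"{a}{end_a} ({WORD[symbol]}) {b}{end_b}")
    return out


for line in point_assertions("A", "DURING", "B"):
    print(line)
```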

This scheme captures all unambiguous relations between intervals, that is, all
relations that can be expressed using vectors that contain only one simple
constituent. It can also capture many ambiguous relations, but not all. One can
represent ambiguity as to the pairwise relation of endpoints, but one cannot
represent ambiguity as to the relation of whole intervals. The vector
(BEFORE MEETS OVERLAPS), for example, can be encoded as point assertions, but the
vector (BEFORE AFTER) cannot. See Figure 2-7.
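
One way to test whether a given interval vector falls in this fragment is sketched below. The check is an illustration, not a procedure from the report: it projects the vector onto the four endpoint pairs, then asks which simple relations those endpoint vectors admit. The vector is expressible exactly when nothing extra is admitted. All names are hypothetical.

```python
# For each simple relation: (A- vs B-, A- vs B+, A+ vs B-, A+ vs B+).
SIGNATURE = {
    "BEFORE": "<<<<", "MEETS": "<<=<", "OVERLAPS": "<<><", "STARTS": "=<><",
    "DURING": "><><", "ENDS": "><>=", "EQUALS": "=<>=", "ENDED-BY": "<<>=",
    "CONTAINS": "<<>>", "STARTED-BY": "=<>>", "OVERLAPPED-BY": "><>>",
    "MET-BY": ">=>>", "AFTER": ">>>>",
}


def endpoint_vectors(vector):
    """Point vectors (as sets of '<', '=', '>') implied for each endpoint pair."""
    return [frozenset(SIGNATURE[r][k] for r in vector) for k in range(4)]


def admitted_by(projection):
    """Simple interval relations compatible with the four endpoint vectors."""
    return frozenset(r for r, sig in SIGNATURE.items()
                     if all(sig[k] in projection[k] for k in range(4)))


def expressible_in_point_algebra(vector):
    """True when the endpoint vectors admit exactly the relations in vector."""
    return admitted_by(endpoint_vectors(vector)) == frozenset(vector)


print(expressible_in_point_algebra({"BEFORE", "MEETS", "OVERLAPS"}))  # True
print(expressible_in_point_algebra({"BEFORE", "AFTER"}))              # False
```

For (BEFORE AFTER), every endpoint pair projects to (PRECEDES FOLLOWS), which also admits OVERLAPS, DURING, CONTAINS, and OVERLAPPED-BY, so the endpoint assertions cannot single out the two intended relations.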

The fragment of the interval algebra that can be translated to the point algebra
benefits from all the computational advantages of the latter. In particular, the
polynomial-time constraint propagation algorithm is sound and complete for the
fragment. This is the interval representation method that Simmons uses in his
geological reasoning program [Simmons 83, and personal communication].

This fragment of the interval algebra is also the one used by Malik and Binford




INTERVAL VECTOR                  POINT TRANSLATION
A (BEFORE OVERLAPS MEETS) B      A- (PRECEDES) B-
                                 A- (PRECEDES) A+
                                 A+ (PRECEDES) B+
                                 B- (PRECEDES) B+
A (BEFORE AFTER) B               No equivalent point form

ILLUSTRATION: [timeline diagrams of intervals A and B]

Figure  2-7:  Translation  of  interval  algebra  to  point  algebra 


[1983] in their spatio-temporal reasoning program. In their case, though, reasoning
is performed with the exponential Simplex algorithm. This use of the general Simplex
procedure is not strictly necessary, though, since the problem could be solved by the
considerably cheaper constraint propagation algorithm.

Although many applications may be able to restrict their interval temporal
reasoning to the tractable fragment of the interval algebra, some applications may not.
One program that requires the full interval algebra is the planning system of Allen and
Koomen [1983] that we referred to above. In this system, several actions can occur
simultaneously, and must consequently be modeled with intervals. For example, to
declare that two actions are non-overlapping, one asserts

ACT_1 (BEFORE MEETS MET-BY AFTER) ACT_2

As we just showed, this kind of assertion falls outside of the tractable fragment of
the interval algebra. In a planner with this architecture, this representation problem
can be dealt with either by invoking an exponential temporal reasoner, or by bringing
to bear planning-specific knowledge about the ordering of actions.




2.9  Consequences  of  These  Results 

Increasingly, the tools of knowledge representation are being put to use in
practical systems. For these systems, it is often crucial that the representation
components be computationally efficient. This has prompted the Artificial Intelligence
community to start taking seriously the performance of AI algorithms. The present
paper, by considering critically the computational characteristics of several temporal
representations, follows this recent trend.

What lessons may we learn from analyses such as this? Of immediate benefit is
an understanding of the computational advantages and disadvantages of different
representation languages. This permits informed decisions as to how the
representation components of application systems should be structured. We can better
understand when to use the power of general representations, and when to set these
general tools aside in favor of more application-specific reasoners.

A close scrutiny of the ongoing achievements of Artificial Intelligence enables a
better understanding of the nature of AI methods. This process is crucial for the
maturation of our field.


References 

[Allen 83]  Allen, J. F.
    Maintaining Knowledge About Temporal Intervals.
    Communications of the ACM 26(11):832-843, November, 1983.

[Allen 84]  Allen, J. F.
    Towards a General Theory of Action and Time.
    Artificial Intelligence 23(2):123-154, 1984.

[Allen & Hayes 85]  Allen, J. F. and Hayes, P. J.
    A Common-Sense Theory of Time.
    In Proceedings of the Ninth International Joint Conference on
    Artificial Intelligence, pages 528-531. The International Joint
    Conference on Artificial Intelligence (IJCAI), Los Angeles, CA,
    August, 1985.

[Allen & Koomen 83]  Allen, James F., and Koomen, Johannes A.
    Planning Using a Temporal World Model.
    In Proceedings of the Eighth International Joint Conference on
    Artificial Intelligence, pages 741-747. The International Joint
    Conference on Artificial Intelligence (IJCAI), Karlsruhe, W. Germany,
    August, 1983.

[Fikes & Nilsson 71]  Fikes, R., and Nilsson, N. J.
    STRIPS: A new approach to the application of theorem proving to
    problem solving.
    Artificial Intelligence 2:189-208, 1971.

[Malik & Binford 83]  Malik, J. and Binford, T. O.
    Reasoning in Time and Space.
    In Proceedings of the Eighth Int'l. Joint Conference on Artificial
    Intelligence, pages 343-345. The International Joint Conference
    on Artificial Intelligence (IJCAI), Karlsruhe, W. Germany, August,
    1983.

[Schmolze 86]  Schmolze, J. G.
    Physics for Robots: Representing Everyday Physics for Robot
    Planning.
    PhD thesis, The University of Massachusetts, Amherst, 1986.

[Simmons 83]  Simmons, R. G.
    The Use of Qualitative and Quantitative Simulations.
    In Proceedings of the Third National Conference on Artificial
    Intelligence (AAAI-83). The American Association for Artificial
    Intelligence, Washington, D.C., August, 1983.

[Valdés-Pérez 86]  Valdés-Pérez, R. E.
    Spatio-Temporal Reasoning and Linear Inequalities.
    1986.
    Unpublished A.I. Memo, Massachusetts Institute of Technology Artificial
    Intelligence Laboratory.




3.  PHYSICS  FOR  ROBOTS 

James  G.  Schmolze 

Abstract 

Robots that plan to perform everyday tasks need knowledge of everyday physics.
Physics For Robots (PFR) is a representation of part of everyday physics directed
towards this need. It includes general concepts and theories, and it has been applied
to tasks in cooking. PFR goes beyond most AI planning representation schemes by
including natural processes that a robot can control. It also includes a theory of
material composition so robots can identify and reason about physical objects that
break apart, come together, mix, or go out of existence. Following on Naive Physics
(NP), issues about reasoning mechanisms are temporarily postponed, allowing a focus
on the characterization of knowledge. However, PFR departs from NP in two ways: (1)
PFR characterizes the robot's capabilities to act and perceive, and (2) PFR replaces
the NP goal of developing models of actual common sense knowledge. Instead, PFR
includes all and only the knowledge that robots need for planning, which is determined
by analyzing proofs showing the effectiveness of robot I/O programs.



3.1 Introduction

Physics For Robots (PFR) represents knowledge of everyday physics according to the
physical capabilities and planning needs of robots. This knowledge is intended to be
an important part of the overall knowledge given to a robot. Physical capabilities are
represented within PFR by specifying the perceptual and action functionality of a
(hypothetical) robot. This specification comprises an I/O programming language,
whose primitive instructions correspond to primitive perceptions and actions, and an
operational semantics, which describes the real world effects of executing I/O
programs. (Given the complexity of the real world, this semantics is necessarily
incomplete.) The hypothetical robot used for this research has capabilities that are
beyond current technology but within reach of near-future technology. Some of the
robot's capabilities and an I/O program are presented later in this paper.

PFR's representation of everyday physics is very similar in style to Hayes' Naive
Physics (NP) formalizations [2, 3]. Like NP, PFR focuses on characterizing knowledge
while postponing implementation considerations. However, NP is ultimately after
realistic models of common sense (see [2], page 5) whereas PFR is after the knowledge
that robots need to plan for everyday tasks. As a result, PFR includes a specification
of the robot's I/O capabilities whereas NP postpones such considerations. More
importantly, PFR includes a criterion for judging the value of its representations,
whereas NP must rely on the existing, and small, body of what is known about common
sense along with one's own intuitions.

One begins to evaluate a PFR representation by selecting a set of everyday tasks for
the robot to perform and, for each task, designing an I/O program that, when
executed, will cause the robot to successfully perform the task. An I/O program is
one whose primitive instructions are only perceptions and actions for the robot to
perform (see Section 3.4). The test for PFR is whether or not its theory of everyday
physics is adequate to prove that the execution of each program will accomplish its
corresponding task. The more program/task pairs that can be proven correct using a PFR
representation, the greater the PFR's expressive power and the better the PFR.
Further, given two expressively similar PFR representations, one should choose the
simpler of the two, and one should choose the representation that is most in keeping
with what is known about common sense.

I point out that there are two notions of correctness here. One is whether or not
executing a program will actually accomplish the given task in the real world. PFR
cannot be used to show this directly. For hypothetical robots, only informal
arguments can be used here. For actual robots, the programs can be executed and the
robots observed. The second notion of correctness corresponds to whether or not
executing the program accomplishes the task according to the theories of a PFR
representation. The extent to which these two notions of correctness are in
agreement is the extent to which the representation is successful.

3.2  Composition  of  Materials 

Physical objects in the everyday world can come into or go out of existence, break
apart, come together or mix. Examples from cooking include water that boils and
turns to steam, or the pouring of hot water over coffee grounds to create a cup of
coffee. PFR must provide the robot with knowledge to deal with such phenomena by
giving it a theory of material composition. Such a theory provides a robot with the
skills to:

o identify physical objects as they come into or go out of existence, or go through
transformations, and

o determine the properties of whole objects from the properties of their parts, and
vice versa, including when the parts are not readily identifiable (such as the
portion of the hot water that went into a cup of coffee).



My theory of material composition includes three components: (1) a theory of what
constitutes the physical objects, (2) the part-whole relation along with a theory that
identifies parts from wholes and vice versa, and (3) a theory that determines the
properties of parts from the properties of wholes and vice versa. In this paper, I will
only touch on (1) and (2), and will ignore (3) completely given that I will focus on
processes. (See [5] for a fuller treatment of material composition.)

Before discussing physical objects, I now introduce some basic elements of PFR.
Instants of time are represented as individuals, and they form a continuum. Let
"seconds" map real numbers to instants, where "seconds(n)" denotes n seconds. Points
in space form a 3-dimensional continuum. Changing relations are represented as
functions on instants of time. Formulas and terms for these relations are written with
the time argument separated. For example, "occ.space(x)(t)" denotes the set of points
in space that x occupies at time t. "occ.space(x)(t)" is defined iff x is a physical
object, t is an instant of time, and x exists at t. Further, x must occupy a non-empty
set. Also, "vol(x)(t)" denotes the volume occupied by a physical object at time t,
which is defined as the volume of "occ.space(x)(t)", and which is greater than zero for
existing physical objects.
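
As an illustration only (PFR itself postpones implementation questions), the convention of writing changing relations with the time argument separated can be mimicked with curried functions. The sketch below is an assumption-laden toy: space is discretized into unit cells, instants are floats, and the world model `WORLD`, the object `ICE_CUBE` and the constant `CELL_VOLUME` are all hypothetical names introduced here.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Instant = float              # assumption: an instant is a real number of seconds
Cell = Tuple[int, int, int]  # assumption: space is discretized into unit cells
CELL_VOLUME = 1.0            # assumption: volume of one cell, in arbitrary units


def seconds(n: float) -> Instant:
    """Map a real number n to the instant denoted by seconds(n)."""
    return float(n)


# Hypothetical world model: object name -> (instant -> occupied cells).
# An empty set stands for "the object does not exist at that instant".
WORLD: Dict[str, Callable[[Instant], FrozenSet[Cell]]] = {
    "ICE_CUBE": lambda t: frozenset({(0, 0, 0), (0, 0, 1)}) if t < 60 else frozenset(),
}


def occ_space(x: str) -> Callable[[Instant], FrozenSet[Cell]]:
    """occ.space(x)(t): defined only if x exists at t; never the empty set."""
    def at(t: Instant) -> FrozenSet[Cell]:
        cells = WORLD[x](t)
        if not cells:
            raise ValueError(f"occ.space({x})({t}) is undefined")
        return cells
    return at


def vol(x: str) -> Callable[[Instant], float]:
    """vol(x)(t): the volume of occ.space(x)(t), greater than zero when defined."""
    return lambda t: len(occ_space(x)(t)) * CELL_VOLUME
```

Here `vol("ICE_CUBE")(seconds(10))` is computed from the occupied space, while asking for the occupied space after the object has gone out of existence raises an error rather than returning an empty set.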



A quantity, borrowed from Hayes [2], is a set of measurements of a given type. For
example, the temperatures and the volumes each form a quantity. Each quantity forms
a continuum. I will introduce functions from the reals to various quantities, in the
style of Hayes, as needed. For example, "cups(4)" denotes a volume of 4 cups.


Types that are not time-varying are called basic types. An example is being a
physical object or a temperature. (See [5] for the reasons for the above design
choices.)


Regarding notation, boolean function names, i.e., predicate names, will be capitalized.



Other function names are written in all lower case. Names of constants are written in
all capital letters. Names of variables are written in lower case. Variable names
beginning with "t" are implicitly of type "Instant", which denotes the basic type for
instants of time. I will write "(t_1..t_2)" to denote the open interval from t_1 to t_2.
Also, I will use the following shorthand when a time-varying predicate, say P, is true
over an open interval.

\[
P_{(t_1..t_2)} \equiv [\forall t \in (t_1..t_2)][P(t)] \tag{1}
\]

Being a physical object is a basic type, and I write "Phys.obj(x)" when x is an
individual physical object. In order to represent physical objects coming into and
going out of existence, I introduce existence as a property of physical objects. Let
"Exists(x)(t)" be true when x is a physical object that exists at time t. Physical
objects include those objects normally considered as such, e.g., books, cars,
computers, the atmosphere, oceans and glasses of water. However, for certain types
of transformations that physical objects undergo, it will be useful to include very small
physical objects, possibly objects at the level of atoms and molecules. For example,
the process of evaporation can be described by having small pieces of liquid turn to
gas and leave the container holding the liquid. Also, by adding some sugar to water
and stirring, the entire glass of water becomes sweet. By using small pieces again,
one can describe mixtures and show the spread of the sweetness as a dispersion of
small pieces of sugar. When hot water is poured over coffee grounds, a new object is
created, coffee. It too is a mixture, which can be useful for determining that, say,
the coffee is hot because it is primarily composed of pieces of water that were hot
just a few seconds earlier.


Hayes ([3], page 74) eschews an atomistic theory because he considers it to be
beyond the realm of common sense. In traditional physics, there is a complicated gap
to bridge between the microscopic and macroscopic versions of certain properties such
as temperature, volume and state. Does the robot need to know about actual atoms
and molecules, and if not, what simpler theory will meet the robot's needs?



Fortunately, there is a way to meet the robot's needs without introducing
microscopic versions of temperature, volume and state. To this end, I invent a class
of physical objects that I call granules. Their essential properties are that:

o they are small enough to be a part of all solid, liquid and gaseous physical
objects (they are too small to be seen individually),

o they are large enough to have the usual macroscopic properties of temperature,
state and volume (each has a volume greater than zero),

o they are pure, be they purely water, wood, or whatever, and

o they have no proper parts, and consequently, no two granules share parts nor
occupied space.

Further, granules of the same type are similar. For example, two water granules with
the same heat content will have the same temperature and state. Granules form the
smallest physical objects in my ontology. I let "Granule" denote a basic type for
granules.

By coupling the part-whole relation with granules, I have a powerful tool for
describing material composition. Let "Part(x,y)(t)" be true iff x and y are physical
objects that exist at t and x is a part of y at t. "Part" forms a partial order over
existing physical objects at each instant. From these relations, I can define a function,
called "gset", from physical objects to the sets of granules that comprise them at an
instant.

\[
[\forall y,t][\mathrm{gset}(y)(t) = \{x \mid \mathrm{Granule}(x) \land \mathrm{Part}(x,y)(t)\}] \tag{2}
\]
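
A minimal sketch of how "gset" could be computed for a single instant is given below. It is not taken from PFR: it assumes the "Part" relation is supplied as a table of immediate parts (`DIRECT_PARTS`, a hypothetical name) and takes "Part" to be the reflexive, transitive closure of that table, which is one way of obtaining a partial order. The object and granule names are invented.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical snapshot of the world at one instant t.
# DIRECT_PARTS[y] lists the immediate parts of y; all names are made up.
DIRECT_PARTS: Dict[str, Set[str]] = {
    "COFFEE": {"WATER_PORTION", "EXTRACT"},
    "WATER_PORTION": {"G1", "G2"},
    "EXTRACT": {"G3"},
}
GRANULES: Set[str] = {"G1", "G2", "G3"}


def parts(y: str) -> Set[str]:
    """All x with Part(x, y) at t: reflexive, transitive closure of DIRECT_PARTS."""
    seen = {y}
    stack = [y]
    while stack:
        current = stack.pop()
        for p in DIRECT_PARTS.get(current, set()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def gset(y: str) -> FrozenSet[str]:
    """gset(y)(t) = {x | Granule(x) and Part(x, y)(t)} for the snapshot above."""
    return frozenset(x for x in parts(y) if x in GRANULES)
```

Under these assumptions the gset of a granule is the singleton containing that granule, since granules have no proper parts.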

I will use the ability to determine an object's "gset" as the criterion for identifying the
object. For example, let there be a glass called G that contains some liquid at time
T. If G and T are identified to the robot, I can identify the liquid in G as W with the
following.

\[
\mathrm{gset}(W)(T) = \{x \mid \mathrm{Granule}(x) \land \mathrm{Liquid}(x)(T) \land \mathrm{Contains}(G,x)(T)\} \tag{3}
\]

"Liquid(x)(t)" is true iff x exists and is entirely liquid at t. ("Solid" and "Gas" are
defined similarly for the solid and gaseous states.) "Contains(x,y)(t)" is true iff x and
y exist and x contains y at t. Borrowing from Hayes [3], I have used containment to
identify this liquid object.




I can go a step further and write a general rule that allows the robot to identify a
contained quantity of liquid as a physical object. The first line in Formula 4 requires
that there is some liquid in a container and the remainder asserts the existence of
the object formed by all the liquid in the container.

\[
\begin{aligned}
&[\forall c,t][([\exists x][\mathrm{Contains}(c,x)(t) \land \mathrm{Liquid}(x)(t)]) \rightarrow\\
&\quad[\exists l][\mathrm{Phys.obj}(l) \land \mathrm{Exists}(l)(t) \land{}\\
&\qquad \mathrm{gset}(l)(t) = \{y \mid \mathrm{Granule}(y) \land \mathrm{Liquid}(y)(t) \land \mathrm{Contains}(c,y)(t)\}]]
\end{aligned}
\tag{4}
\]

Here, x can be a single liquid granule.
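
The rule in Formula 4 can be pictured with a small sketch, given here under stated assumptions rather than as part of PFR. It assumes the robot has, for one instant, a record of each granule's state and container (`GranuleFact`, a hypothetical record layout) and returns the gset that identifies the contained liquid object, or nothing when the container holds no liquid. Only granules are inspected, which is sufficient because the liquid x in the antecedent can be a single granule.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class GranuleFact:
    """State of one granule at a fixed instant t (hypothetical record layout)."""
    name: str
    state: str                 # "solid", "liquid" or "gas"
    container: Optional[str]   # name of the containing object, if any


def liquid_object_gset(c: str, facts: Iterable[GranuleFact]) -> Optional[FrozenSet[str]]:
    """Return the gset of the liquid object in c, or None if c holds no liquid."""
    granules = frozenset(
        f.name for f in facts if f.state == "liquid" and f.container == c
    )
    return granules if granules else None


FACTS = [
    GranuleFact("G1", "liquid", "GLASS"),
    GranuleFact("G2", "liquid", "GLASS"),
    GranuleFact("G3", "solid", "GLASS"),   # e.g. an undissolved sugar granule
    GranuleFact("G4", "liquid", "POT"),
]
```

With these made-up facts, the liquid object in "GLASS" is identified by the two liquid granules it contains, and an empty container yields no object at all.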

Space does not permit a thorough examination of the utility of granules. The
interested reader should refer to [5] where there are rules that allow the robot to
identify liquid objects that are poured elsewhere, are mixed with other liquids,
partially evaporate, etc. In addition, there are rules that allow the robot to infer
various properties of these transformed objects, such as their temperature, volume or
composition, all without special knowledge about the properties of microscopic objects.
Further, the robot needs to reason about granules only when necessary; it can reason
about normal physical objects without considering granules. The general PFR
representation thus far allows a wide variety of such rules to be formulated. However,
the actual rules for identifying objects to be given to a particular robot will be
application dependent.

3.3  Simple  Processes 

Any robot that deals with the everyday world must be able to predict changes due
to nature. An important source of natural changes is natural processes, and so, PFR
includes them. I have limited my study to a class of process types that I call simple.
All simple process types have an enabling condition and an effect, both of which
depend only on the physical condition of the world (and not on, say, the intention of
any agent). Basically, an instance of a simple process type occurs when and only
when the enabling condition is true for some set of physical objects, and the process
has the given effect on the world while it is occurring. For example, whenever a
faucet's knob is open, water flows from the faucet. Or, whenever two physical objects
are of different temperatures and are in thermal contact, heat flows from the hotter
to the cooler object. I note that many real processes are not simple.

Given instances of simple process types (i.e., simple processes), a robot must be able
to determine when they occur, how to identify them (e.g., deciding when two processes
are the same or different), and what their effects are. Further, these factors must be
determinable from limited information. For example, it must be possible to determine a
process's effects without knowing when the process will end. Also, the manner of
describing effects must allow for either discrete or continuous changes. For example,
heat is measured on a continuum, so heat transfer causes continuous changes.
However, water flowing from a faucet is (eventually) measured by the transfer of whole
water granules, so faucet flow causes discrete changes. Finally, the representation
must allow for situations where several processes affect the same property of the very
same objects, such as a heating and cooling process occurring simultaneously on the
same pot of water.

I note that Hayes [2, 3] does not address these points directly. Others, such as
[1] and [4], have addressed some but not all of them.

I represent simple processes as individuals. Let "Occurs(x)(t)" be true iff x is an
event that is occurring at time t. "Occurs" for events is analogous to "Exists" for
physical objects.

I will illustrate the essential properties of simple process types by describing the
process type for water flowing from a kitchen faucet. Along with that, I will describe
faucets, objects associated with faucets (such as their controlling knobs), and their
operation. Let "Faucet.flow" be a basic type for faucet flow processes. Each simple
process has a set of players, i.e., the physical objects that are involved. For
"Faucet.flow", the only player is the faucet, with which I associate other objects. In
my model, a faucet has a knob, a head, a sink, a supply container that holds the
faucet's supply and, of course, the water in the supply container. Let "Kitchen.faucet"
and "Faucet.knob" be basic types for kitchen faucets and their controlling knobs,
respectively. The knob has fully closed and fully open positions, and there are
positions in between. Let "closed.position(k)(t)" denote the space that a faucet knob,
k, must occupy in order to be fully closed at time t. Let "open.position(k)(t)" be
similar, but for the fully open position. From these functions, I can define
"Closed.knob(k)(t)" as true iff k is a faucet knob that is fully closed at t and
"Open.knob(k)(t)" as true iff k is a fully open faucet knob at t.

\[
\begin{aligned}
&[\forall k,t][\mathrm{Closed.knob}(k)(t) \leftrightarrow \mathrm{Faucet.knob}(k) \land{}\\
&\qquad \mathrm{occ.space}(k)(t) = \mathrm{closed.position}(k)(t)] \land{}\\
&[\forall k,t][\mathrm{Open.knob}(k)(t) \leftrightarrow \mathrm{Faucet.knob}(k) \land{}\\
&\qquad \mathrm{occ.space}(k)(t) = \mathrm{open.position}(k)(t)]
\end{aligned}
\tag{5}
\]

If neither is true, the knob is in between. In addition, let "knob.of.faucet(f)(t)",
"supply.cont.of.faucet(f)(t)" and "supply.of.faucet(f)(t)" denote the existing knob, supply
container and water supply, respectively, of f when f is an existing faucet.
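
The knob predicates can be sketched as comparisons between the space a knob occupies and its two reference positions. The following is only an illustration: regions are modelled as sets of grid cells, the reference positions are taken to be constant over time, and the knob `K1` with its three-phase trajectory is a hypothetical example.

```python
from typing import Callable, FrozenSet, Tuple

Cell = Tuple[int, int, int]
Region = FrozenSet[Cell]

# Hypothetical regions for one knob; the cell coordinates are arbitrary.
CLOSED: Region = frozenset({(0, 0, 0)})
HALFWAY: Region = frozenset({(0, 0, 1)})
OPEN: Region = frozenset({(0, 1, 0)})

FAUCET_KNOBS = {"K1"}  # individuals of basic type Faucet.knob (assumed)


def closed_position(k: str) -> Callable[[float], Region]:
    return lambda t: CLOSED   # assumption: the closed position never moves


def open_position(k: str) -> Callable[[float], Region]:
    return lambda t: OPEN     # assumption: the open position never moves


def occ_space(k: str) -> Callable[[float], Region]:
    """Made-up trajectory: closed before t=2, halfway during [2, 3), open from t=3."""
    def at(t: float) -> Region:
        if t < 2:
            return CLOSED
        if t < 3:
            return HALFWAY
        return OPEN
    return at


def closed_knob(k: str) -> Callable[[float], bool]:
    return lambda t: k in FAUCET_KNOBS and occ_space(k)(t) == closed_position(k)(t)


def open_knob(k: str) -> Callable[[float], bool]:
    return lambda t: k in FAUCET_KNOBS and occ_space(k)(t) == open_position(k)(t)


def knob_state(k: str, t: float) -> str:
    """Classify a knob as closed, open, or in between at instant t."""
    if closed_knob(k)(t):
        return "closed"
    if open_knob(k)(t):
        return "open"
    return "in between"
```

An object that is not a faucet knob satisfies neither predicate, mirroring the "Faucet.knob(k)" conjunct in Formula 5.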

The enabling condition for the "Faucet.flow" process type is written over an interval
of time (I will soon explain why) and is true iff a faucet, f, is not fully closed over
some open interval, "(t_1..t_2)". The following is written with f, t_1 and t_2 free; k is
used to simplify the formula.

\[
[\forall t \in (t_1..t_2)][\lnot\mathrm{Closed.knob}(k)(t)] \tag{6}
\]

where "k" is "knob.of.faucet(f)(t)"

I will write "Faucet.not.closed(f)(t_1,t_2)" as a shorthand for Formula 6.

The effect of a "Faucet.flow" process is that water flows from the faucet's supply
container to a receiving container, which is either the faucet's sink, or an open
container under the faucet's head. To describe the effect, I rely on two defined
predicates, "Liq.xfer" and "rate.liq.xfer" (only "rate.liq.xfer" will be formally presented
here). "Liq.xfer(c_1,c_2,t_b,t_e)" is true iff the following holds.

1. There is some liquid in a container, c_1, at t_b.

2. Throughout the open time interval from t_b to t_e, granules from the
liquid in c_1 are transferred to a different container, c_2. The transfer could have
begun before t_b and could have ended after t_e; "Liq.xfer" only states that a
transfer occurred throughout the particular interval "(t_b..t_e)". Further, the
liquid need not remain in c_2 (e.g., it could be transferred elsewhere).

"rate.liq.xfer(c_1,c_2,t_b,t_e)" denotes the average rate of a liquid transfer satisfying
"Liq.xfer(c_1,c_2,t_b,t_e)". It is just the volume of the liquid actually transferred divided
by the time of transfer. I calculate this volume by summing over the volumes of
granules transferred since (1) all the liquid that is transferred may not form a single
individual (e.g., if part of it was transferred elsewhere from c_2 during "(t_b..t_e)") and
(2) granules share no parts, so I will get an accurate measurement of volume. Since
the number of granules transferred is discrete, I place a minimum length on the time
interval over which this rate can be calculated, this minimum being large enough
so that a reasonably large number of granules are certain to have transferred. If
these intervals are allowed to be arbitrarily small, inaccurate measurements can
result. Let "Δt_xfer" denote this minimum interval length, which I set to one-tenth
second.

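\[
\begin{aligned}
&[r = \mathrm{rate.liq.xfer}(c_1,c_2,t_b,t_e) \leftrightarrow \mathrm{Liq.xfer}(c_1,c_2,t_b,t_e) \land{}\\
&\qquad t_e - t_b \ge \Delta t_{xfer} \land
r = \frac{\mathrm{vol.gset}(v(c_1,c_2,t_b,t_e))(t_e)}{t_e - t_b}]
\end{aligned}
\tag{7}
\]

where

\[
\begin{aligned}
v(c_1,c_2,t_b,t_e) = \{x \mid{} &\mathrm{Granule}(x) \land{}\\
&[\exists t_1 \in (t_b..t_e), t_2 \in (t_b..t_e)]\\
&[t_1 < t_2 \land \mathrm{Liquid}(x)(t_1) \land{}\\
&\ \mathrm{Contains}(c_1,x)(t_1) \land \mathrm{Contains}(c_2,x)(t_2)]\}
\end{aligned}
\tag{8}
\]

and where "vol.gset(x)(t)" is just the volume of a set of existing granules, x, at time t.

\[
\begin{aligned}
&[\forall x,y,t][y = \mathrm{vol.gset}(x)(t) \leftrightarrow{}\\
&\qquad \mathrm{Set}(x) \land [\forall z \in x][\mathrm{Granule}(z) \land \mathrm{Exists}(z)(t)] \land{}\\
&\qquad y = \sum_{z \in x} \mathrm{vol}(z)(t)]
\end{aligned}
\tag{9}
\]

"Set(x)" is true iff x is a set.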
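
The average-rate calculation can be illustrated with a toy computation. This is a sketch under several assumptions that are not in the text: the robot's knowledge is reduced to a finite log of (time, container) sightings per liquid granule (`LOG`), granule volumes are given in cups (`VOLUMES`), and the granules are unrealistically large so that the arithmetic is easy to follow. The "Liq.xfer" precondition itself is not checked.

```python
from typing import Dict, FrozenSet, List, Tuple

MIN_INTERVAL = 0.1  # seconds; the one-tenth-second minimum stated in the text

# Hypothetical observation log: granule -> [(time, container)] while liquid.
Log = Dict[str, List[Tuple[float, str]]]


def transferred(log: Log, c1: str, c2: str, tb: float, te: float) -> FrozenSet[str]:
    """Granules in c1 at some t1 and in c2 at some later t2, with t1, t2 in (tb, te)."""
    result = set()
    for granule, sightings in log.items():
        in_c1 = [t for t, c in sightings if c == c1 and tb < t < te]
        in_c2 = [t for t, c in sightings if c == c2 and tb < t < te]
        # A pair t1 < t2 exists iff the earliest c1 sighting precedes the latest c2 one.
        if in_c1 and in_c2 and min(in_c1) < max(in_c2):
            result.add(granule)
    return frozenset(result)


def vol_gset(granules: FrozenSet[str], volumes: Dict[str, float]) -> float:
    """Sum of the volumes of a set of granules (granules share no parts)."""
    return sum(volumes[g] for g in granules)


def rate_liq_xfer(log: Log, volumes: Dict[str, float],
                  c1: str, c2: str, tb: float, te: float) -> float:
    """Average transfer rate over (tb, te); undefined below the minimum length."""
    if te - tb < MIN_INTERVAL:
        raise ValueError("interval shorter than the minimum length")
    return vol_gset(transferred(log, c1, c2, tb, te), volumes) / (te - tb)


# Made-up data: three granules move from SUPPLY to CUP, one stays behind.
LOG: Log = {
    "G1": [(0.2, "SUPPLY"), (0.6, "CUP")],
    "G2": [(0.4, "SUPPLY"), (0.9, "CUP")],
    "G3": [(0.5, "SUPPLY"), (1.2, "CUP")],
    "G4": [(0.3, "SUPPLY"), (1.0, "SUPPLY")],
}
VOLUMES = {"G1": 0.125, "G2": 0.125, "G3": 0.125, "G4": 0.125}  # cups
```

Summing over granules rather than over a single liquid object is what lets the sketch ignore whether the transferred liquid still forms one individual at the end of the interval.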


I define the effect of a "Faucet.flow" process to be that, if the faucet is fully open,
water transfers from the faucet's supply container to a receiving container at the rate
of one-quarter cup per second. If it is partially open, the rate is between
one-sixtieth and one-quarter cup per second (this is idealized to simplify its
presentation). The following describes the effect of a "Faucet.flow" process, p, that is
occurring during "(t_1..t_2)" (remember, for p to occur, the faucet must not be closed).
Let "faucet.of.flow(p)" denote the faucet involved with p. c, r and k are introduced to
simplify the formula.

\[
\begin{aligned}
&\mathrm{Liq.xfer}(c,r,t_1,t_2) \land{}\\
&[\mathrm{Open.knob}(k)_{(t_1..t_2)} \rightarrow
\mathrm{rate.liq.xfer}(c,r,t_1,t_2) = \frac{\mathrm{cups}(1)}{\mathrm{seconds}(4)}] \land{}\\
&[\lnot\mathrm{Open.knob}(k)_{(t_1..t_2)} \rightarrow
\frac{\mathrm{cups}(1)}{\mathrm{seconds}(60)} \le \mathrm{rate.liq.xfer}(c,r,t_1,t_2) \le \frac{\mathrm{cups}(1)}{\mathrm{seconds}(4)}]
\end{aligned}
\tag{10}
\]

where
"c" is "supply.cont.of.faucet(faucet.of.flow(p))(t)"
"r" is "receptacle.of.flow(p)(t)"
"k" is "knob.of.faucet(faucet.of.flow(p))(t)"

"receptacle.of.flow" is a function that is defined using geometrical primitives. I will not
discuss it in this paper except to state that, for a "Faucet.flow" process, it refers
either to the faucet's sink or to an open container directly below the faucet's head.

For the formulas that follow, I will use "Effect(p)(t_1,t_2)" to refer to Formula 10.
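
The rate bounds in the effect lend themselves to a short worked example. The sketch below uses exact fractions for the two rates stated in the text (one-quarter and one-sixtieth cup per second); the helper `seconds_to_transfer` and the two-cup scenario are assumptions added for illustration and do not appear in PFR.

```python
from fractions import Fraction

# Rates in cups per second, taken from the text.
FULLY_OPEN_RATE = Fraction(1, 4)    # cups(1) / seconds(4)
MIN_PARTIAL_RATE = Fraction(1, 60)  # cups(1) / seconds(60)


def effect_holds(liq_xfer: bool, knob_open_throughout: bool, rate: Fraction) -> bool:
    """Check the three conjuncts of the effect for one interval."""
    if not liq_xfer:
        return False
    if knob_open_throughout:
        return rate == FULLY_OPEN_RATE
    return MIN_PARTIAL_RATE <= rate <= FULLY_OPEN_RATE


def seconds_to_transfer(cups: Fraction, rate: Fraction) -> Fraction:
    """Time needed to transfer a volume at a constant average rate."""
    return cups / rate
```

For instance, transferring two cups takes 8 seconds with the knob fully open and at most 120 seconds when the knob is only partially open, which is the kind of bound a planner would need when deciding how long to leave a faucet running.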


The effect of a water flow process is written over an interval of time because there
is a discrete quantity being measured, as I explained above. For this reason, I will
place a minimum length on the intervals over which the effect of a faucet flow process
is calculated (as will be seen in Formula 15). Let "Δt_eff" denote this minimum, which,
like the minimum interval length for liquid transfers, "Δt_xfer", is one-tenth second.
For simple process types whose effects can be measured on a continuum, "Δt_eff" is
zero, making it possible to describe such process types using instantaneous rates, if
desired. I note that enabling conditions are expressed over intervals for similar
reasons, although for the enabling condition of "Faucet.flow", there is no need for a
minimum length interval.


There are 5 essential properties of simple process types. For each, I include a
formula written for "Faucet.flow" that describes the property. Each simple process
type will have 5 similar formulas.

1. An instance begins when (or just after) the enabling condition goes from false to
true for some set of players. t_b represents the beginning time for a process.

\[
\begin{aligned}
&[\forall f{:}\mathrm{Kitchen.faucet}, t_b]\\
&\quad[\lnot[\exists t][t < t_b \land \mathrm{Faucet.not.closed}(f)(t,t_b)] \land{}\\
&\quad\ [\exists t][t > t_b \land \mathrm{Faucet.not.closed}(f)(t_b,t)] \rightarrow\\
&\qquad[\exists p][\mathrm{Faucet.flow}(p) \land f = \mathrm{faucet.of.flow}(p) \land{}\\
&\qquad\ [\forall t][t < t_b \rightarrow \lnot\mathrm{Occurs}(p)(t)] \land{}\\
&\qquad\ [\forall t][t > t_b \land \mathrm{Faucet.not.closed}(f)(t_b,t) \rightarrow \mathrm{Occurs}(p)_{(t_b..t)}]]]
\end{aligned}
\tag{11}
\]

i.e., for appropriate t_b's, a faucet flow process begins at t_b whose player, its
faucet, is f and which continues while the faucet is not closed.

2. An instance continues as long as the enabling condition remains true for those
players.

\[
\begin{aligned}
&[\forall f{:}\mathrm{Kitchen.faucet}, t_1, t_2]\\
&\quad[t_1 < t_2 \land \mathrm{Faucet.not.closed}(f)(t_1,t_2) \rightarrow\\
&\qquad[\exists p][\mathrm{Faucet.flow}(p) \land f = \mathrm{faucet.of.flow}(p) \land \mathrm{Occurs}(p)_{(t_1..t_2)}]]
\end{aligned}
\tag{12}
\]


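3. An instance ends when (or just before) the condition first becomes false after the
process starts for those players. t_e represents the ending time for the process.

\[
\begin{aligned}
&[\forall f{:}\mathrm{Kitchen.faucet}, t_e]\\
&\quad[[\exists t][t < t_e \land \mathrm{Faucet.not.closed}(f)(t,t_e)] \land{}\\
&\quad\ \lnot[\exists t][t > t_e \land \mathrm{Faucet.not.closed}(f)(t_e,t)] \rightarrow\\
&\qquad[\exists p][\mathrm{Faucet.flow}(p) \land f = \mathrm{faucet.of.flow}(p) \land{}\\
&\qquad\ [\forall t][t > t_e \rightarrow \lnot\mathrm{Occurs}(p)(t)] \land{}\\
&\qquad\ [\forall t][t < t_e \land \mathrm{Faucet.not.closed}(f)(t,t_e) \rightarrow \mathrm{Occurs}(p)_{(t..t_e)}]]]
\end{aligned}
\tag{13}
\]

i.e., for appropriate t_e's, a faucet flow process ends at t_e whose player, its
faucet, is f and which has continued for as long as the faucet has not been
closed.

4. If two individual simple processes of the same type and with the same players
overlap in the times of their occurrences, they are the very same process.

\[
\begin{aligned}
&[\forall p_1{:}\mathrm{Faucet.flow}, p_2{:}\mathrm{Faucet.flow}]\\
&\quad[\mathrm{faucet.of.flow}(p_1) = \mathrm{faucet.of.flow}(p_2) \land{}\\
&\quad\ [\exists t][\mathrm{Occurs}(p_1)(t) \land \mathrm{Occurs}(p_2)(t)] \rightarrow p_1 = p_2]
\end{aligned}
\tag{14}
\]

5. The effect applies to the players while the process occurs over intervals at least
as long as the given minimum length, "Δt_eff".

\[
\begin{aligned}
&[\forall p{:}\mathrm{Faucet.flow}, t_1, t_2]\\
&\quad[t_2 - t_1 \ge \Delta t_{eff} \land \mathrm{Occurs}(p)_{(t_1..t_2)} \rightarrow \mathrm{Effect}(p)(t_1,t_2)]
\end{aligned}
\tag{15}
\]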

This knowledge allows the robot to determine when faucet flow processes begin,
continue and end. It provides identity criteria for these processes and it describes
their effect in the real world. Thus, the robot is well equipped to plan to control
such processes. In Section 3.5, this knowledge is used to show the effectiveness of an
I/O program.
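
The identity criterion (property 4) suggests a simple bookkeeping step, sketched below under stated assumptions. Suppose the robot has established that one faucet was not closed over several open intervals, some of which overlap. Since overlapping occurrences for the same faucet belong to one process, merging overlapping intervals gives one interval per distinct process. The function name and the tuple representation of open intervals are hypothetical. Intervals that merely touch at an endpoint are left separate, because the open intervals say nothing about that shared instant.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # an open interval (start, end) with start < end


def merge_occurrences(observed: List[Interval]) -> List[Interval]:
    """Merge open intervals that share at least one instant."""
    merged: List[Interval] = []
    for start, end in sorted(observed):
        if merged and start < merged[-1][1]:
            # Overlaps the previous occurrence: same process, extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, occurrences over (0, 3) and (2, 5) collapse into a single process over (0, 5), while an occurrence over (7, 9) remains a separate process.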

3.4  Robot  Perception  and  Action 

Any robot that plans must know the consequences of executing its perceptual and
action routines, i.e., its own I/O programs. In this section, I specify the I/O
functionality of a hypothetical robot as part of PFR.

In order to describe the effects of executing programs, a model of the robot's
internal state and capabilities is needed. The robot can move about, grasp certain
kinds of objects with its (single) arm and hand, and can determine certain kinds of
situations by "looking" through its (single) camera eye. Let "Near(x)(t)" be true iff the
robot is near object x at t. To be near an object means that the robot is able to see
it and reach it. "Grasped(x)(t)" is true iff the robot is grasping object x at t. In order
to be grasped, the object must be of a certain shape, which I denote with
"Graspable(x)(t)". Only one object can be grasped at a time. In order to represent the
robot's ability to identify and find objects at given times, I introduce
"Identifiable(x)(t)", which partially models the robot's internal memory state.

The I/O language includes calls to primitive input and output procedures,
sequencing, compound statements, if-then-else statements and while loops. Output
procedure calls are program statements. Input procedure calls are program functions.
There is no assignment statement. Constants denote individuals such as physical
objects or instants of time. For simplicity, I assume that the execution of the control
portion of statements takes zero time. This includes calls to input procedures, so
they also take zero time to execute. Also for simplicity, output procedures take fixed,
greater-than-zero time to execute. In the descriptions that follow, each output
procedure takes 2 seconds. (For a full specification, see [5].) Let "E(S)(t_1,t_2)" denote
the execution of statement S by the robot where execution begins at t_1 and ends at
t_2, such that a new statement can begin executing at t_2.
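
The timing assumptions above can be made concrete with a small interpreter that computes only when a statement finishes. This is a sketch, not the language specified in [5]: the class names `Output`, `Seq`, `If` and `While`, the representation of input procedures as Python callables of the current time, and the "wait" output procedure in the example are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

OUTPUT_DURATION = 2.0  # seconds taken by each output procedure in this section


@dataclass
class Output:
    """Call to a primitive output procedure, e.g. "grasp K"."""
    name: str


@dataclass
class Seq:
    """Sequencing / compound statement."""
    body: List["Stmt"]


@dataclass
class If:
    cond: Callable[[float], bool]  # input procedure; evaluated in zero time
    then: "Stmt"
    orelse: "Stmt"


@dataclass
class While:
    cond: Callable[[float], bool]  # input procedure; evaluated in zero time
    body: "Stmt"


Stmt = Union[Output, Seq, If, While]


def execute(stmt: Stmt, t1: float) -> float:
    """Return the t2 for which E(stmt)(t1, t2) holds under the timing assumptions."""
    if isinstance(stmt, Output):
        return t1 + OUTPUT_DURATION
    if isinstance(stmt, Seq):
        t = t1
        for s in stmt.body:
            t = execute(s, t)
        return t
    if isinstance(stmt, If):
        return execute(stmt.then if stmt.cond(t1) else stmt.orelse, t1)
    if isinstance(stmt, While):
        t = t1
        while stmt.cond(t):
            t = execute(stmt.body, t)
        return t
    raise TypeError(f"unknown statement: {stmt!r}")


# Hypothetical program: open a faucet, wait until t reaches 10, close it.
PROGRAM = Seq([
    Output("grasp K"),
    Output("open.knob K"),
    While(lambda t: t < 10.0, Output("wait")),  # "wait" is an invented procedure
    Output("close.knob K"),
    Output("release"),
])
```

Because control and input procedures take zero time, the end time of a program is determined entirely by how many output procedures it executes: the example runs seven of them and so finishes 14 seconds after it starts.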

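grasp x: If x is identifiable, graspable, near the robot and nothing is already
grasped, the robot will grasp x.

\[
\begin{aligned}
&[\forall x,t_1,t_2][E(\mathrm{grasp}\ x)(t_1,t_2) \rightarrow\\
&\quad t_2 - t_1 = \mathrm{seconds}(2) \land{}\\
&\quad(\mathrm{Identifiable}(x)(t_1) \land \mathrm{Near}(x)(t_1) \land{}\\
&\quad\ \mathrm{Graspable}(x)(t_1) \land \lnot[\exists y][\mathrm{Grasped}(y)(t_1)]\\
&\qquad\rightarrow \mathrm{Grasped}(x)(t_2))]
\end{aligned}
\tag{16}
\]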


open.knob k: If k is a faucet knob that is currently being grasped, this causes the
robot to move k (if necessary) to its open position. It takes 2 seconds. For
simplicity, I assume that the robot knows the current open position for k. If k is
already open, the robot takes no action. If k is not open, it begins to move k
immediately. At some point during execution of this procedure, k is in the open
position, after which the robot stops moving it. Before describing "open.knob", I
define "Stationary(x)(t1,t2)" to be true iff x does not change location from t1 through
t2.

$$
[\forall x, t_1, t_2]\,[\mathrm{Stationary}(x)(t_1,t_2) \leftrightarrow
  [\forall t \in (t_1..t_2)][\mathrm{occ.space}(x)(t) = \mathrm{occ.space}(x)(t_1)]]
\tag{17}
$$

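$$
\begin{aligned}
&[\forall k{:}\mathrm{Faucet.knob},\ t_1, t_2] \\
&[\mathrm{E}(\mathrm{open.knob}\ k)(t_1,t_2) \rightarrow t_2 - t_1 = \mathrm{seconds}(2)\ \wedge \\
&\quad (\mathrm{Grasped}(k)(t_1) \wedge \mathrm{Open.knob}(k)(t_1) \rightarrow \\
&\qquad \mathrm{Open.knob}(k)(t_1..t_2) \wedge \mathrm{Stationary}(k)(t_1,t_2))\ \wedge \\
&\quad (\mathrm{Grasped}(k)(t_1) \wedge \neg\mathrm{Open.knob}(k)(t_1) \rightarrow \\
&\qquad [\forall t \in (t_1..t_2)][\mathrm{occ.space}(k)(t) \neq \mathrm{occ.space}(k)(t_1)]\ \wedge \\
&\qquad [\exists t \in (t_1..t_2)][\mathrm{Open.knob}(k)(t) \wedge
  \mathrm{Open.knob}(k)(t..t_2) \wedge \mathrm{Stationary}(k)(t,t_2)]\ \wedge \\
&\qquad [\forall t \in (t_1..t_2)][\mathrm{Open.knob}(k)(t) \rightarrow
  \mathrm{Open.knob}(k)(t..t_2)])]
\end{aligned}
\tag{18}
$$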

close.knob k: If k is a faucet knob that is currently being grasped, this causes the
robot to move k (if necessary) to its closed position. It is very similar to the
"open.knob" procedure.




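$$
\begin{aligned}
&[\forall k{:}\mathrm{Faucet.knob},\ t_1, t_2] \\
&[\mathrm{E}(\mathrm{close.knob}\ k)(t_1,t_2) \rightarrow t_2 - t_1 = \mathrm{seconds}(2)\ \wedge \\
&\quad (\mathrm{Grasped}(k)(t_1) \wedge \mathrm{Closed.knob}(k)(t_1) \rightarrow \\
&\qquad \mathrm{Closed.knob}(k)(t_1..t_2) \wedge \mathrm{Stationary}(k)(t_1,t_2))\ \wedge \\
&\quad (\mathrm{Grasped}(k)(t_1) \wedge \neg\mathrm{Closed.knob}(k)(t_1) \rightarrow \\
&\qquad [\forall t \in (t_1..t_2)][\mathrm{occ.space}(k)(t) \neq \mathrm{occ.space}(k)(t_1)]\ \wedge \\
&\qquad [\exists t \in (t_1..t_2)][\mathrm{Closed.knob}(k)(t) \wedge
  \mathrm{Closed.knob}(k)(t..t_2) \wedge \mathrm{Stationary}(k)(t,t_2)]\ \wedge \\
&\qquad [\forall t \in (t_1..t_2)][\mathrm{Closed.knob}(k)(t) \rightarrow
  \mathrm{Closed.knob}(k)(t..t_2)])]
\end{aligned}
\tag{19}
$$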

release: The robot releases whatever is being grasped. It takes 2 seconds.

$$
[\forall t_1, t_2]\,[\mathrm{E}(\mathrm{release})(t_1,t_2) \rightarrow
  t_2 - t_1 = \mathrm{seconds}(2) \wedge \neg[\exists y][\mathrm{Grasped}(y)(t_2)]]
\tag{20}
$$

Less.full(C,P): An input procedure that is true iff container C is less than a certain
fraction full of solid and/or liquid material. P is the fraction. If P is 1, then this is
true whenever C is not full. C must be identified beforehand and the robot must be
near it. The robot estimates the value of this function using its visual capabilities
along with knowledge of the container's shape. However, for this paper, this ability of
the robot is idealized. Let "Q(P)(t)" be true iff the evaluation of input procedure P at
time t would be true.

$$
[\forall t]\,\Big[\mathrm{Identifiable}(C)(t) \wedge \mathrm{Near}(C)(t) \rightarrow
  \Big(Q(\mathrm{Less.full}(C,P))(t) \leftrightarrow
  \frac{\mathrm{vol.gset}(Z)(t)}{\mathrm{contained.vol}(C)(t)} < P\Big)\Big]
\tag{21}
$$

where

$$
Z = \{x \mid \mathrm{Granule}(x) \wedge \mathrm{Contains}(C,x)(t) \wedge \neg\mathrm{Gas}(x)(t)\}
$$

Here, "contained.vol(x)(t)" denotes the maximum volume of liquid material that x can
contain at time t.
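
The idealized test in Formula 21 amounts to a ratio comparison. The following
Python sketch is only an illustration: the function name "less_full" and the
representation of the container's contents as (volume, is_gas) pairs are
assumptions, and the preconditions Identifiable and Near are taken as already
satisfied.

```python
def less_full(contents, capacity, fraction):
    """Return True iff the non-gaseous contents fill less than `fraction`
    of `capacity`.

    contents: iterable of (volume, is_gas) pairs, one per granule or lump
              (a hypothetical stand-in for the granule set Z)
    capacity: contained.vol of the container, in the same volume unit
    """
    non_gas_volume = sum(volume for volume, is_gas in contents if not is_gas)
    return non_gas_volume / capacity < fraction
```

For example, a one-cup pot holding 0.2 cups of water and some air is less than
half full, while the same pot holding 0.6 cups of water is not.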




3.5 Filling a Pot with Water

In this section, I present an I/O program that, when executed under given
conditions, will cause the robot to partially fill a pot with water. The given conditions
are that a pot (P) is upright, in a sink (S), and under the head of a faucet (F) that is
controlled by a knob (K) with a water supply (W) that is stored in a supply container
(C). K is in the closed position. The robot is near the faucet.

    FP:  S1: grasp K;                                              (22)
         S2: open.knob K;
         S3: while Less.full(P, 0.5) do idle.for seconds(0.1);
         S4: close.knob K;
         S5: release;

When FP is executed, the robot grasps K and moves K to the open position. At this
point, water begins flowing into P. In S3, the robot waits until the accumulated water
occupies more than half of P. The robot then closes K and releases it, leaving P about
half full of water.

PFR can be used to show the effectiveness of the FP program. The ontology and
theories presented so far will be used to show that each statement of FP, when
executed, produces a set of conditions needed for the next statement execution, and
that at the end, the FP program has caused the robot to partially fill P with water.
Furthermore, I will demonstrate how the robot has the knowledge to infer the identity
of a faucet flow process, even though no such process is mentioned in the FP program.
I will only sketch a proof in this paper. (A full proof, excluding program termination,
of a similar I/O program can be found in [5].)

I introduce T0 through T5, where S1 is executed from T0 through T1, S2 is executed
from T1 through T2, etc. The relevant given conditions are:




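$$
\begin{aligned}
&\mathrm{Faucet}(F) \wedge \mathrm{Pot}(P)\ \wedge \\
&K = \mathrm{knob.of.faucet}(F)(T_0) \wedge W = \mathrm{supply.of.faucet}(F)(T_0)\ \wedge \\
&C = \mathrm{supply.cont.of.faucet}(F)(T_0)\ \wedge \\
&\mathrm{Contains}(C,W)(T_0) \wedge \mathrm{vol}(W)(T_0) > \mathrm{cups}(1000)\ \wedge \\
&\mathrm{Exists}(F)(T_0) \wedge \mathrm{Exists}(K)(T_0) \wedge \mathrm{Exists}(P)(T_0)\ \wedge \\
&\mathrm{Exists}(C)(T_0) \wedge \mathrm{Exists}(W)(T_0)\ \wedge \\
&\mathrm{contained.vol}(P)(T_0) = \mathrm{cups}(1) \wedge \mathrm{All.water}(W)(T_0)\ \wedge \\
&\mathrm{Identifiable}(P)(T_0) \wedge \mathrm{Identifiable}(K)(T_0)\ \wedge \\
&\mathrm{Near}(P)(T_0) \wedge \mathrm{Near}(K)(T_0) \wedge \mathrm{Graspable}(K)(T_0)\ \wedge \\
&\mathrm{Closed.knob}(K)(T_0) \wedge \neg[\exists y][\mathrm{Grasped}(y)(T_0)]
\end{aligned}
\tag{23}
$$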

Here I have used "Pot", which denotes a basic type for kitchen pots, and
"All.water(x)(t)", which is true iff x is composed entirely of water granules at time t
(definition not shown here).

The goal is that P contains at least half a cup of water at time TG:

$$
[\exists l]\,[\mathrm{Exists}(l)(T_G) \wedge \mathrm{All.water}(l)(T_G) \wedge
  \mathrm{Contains}(P,l)(T_G) \wedge \mathrm{vol}(l)(T_G) \geq \mathrm{cups}(0.5)]
\tag{24}
$$

Throughout this proof sketch, I will need to make default assumptions. However, I
have not investigated theories for making appropriate default assumptions in this
research. Instead, I will simply make those assumptions that are needed and
reasonable. As a result, I have a set of examples that a theory for making default
assumptions must be able to produce. My first assumptions correspond to conditions
that will not change throughout the execution of FP.

Default assumption:

$$
\begin{aligned}
&[\forall t \in (T_0..T_5)] \\
&[K = \mathrm{knob.of.faucet}(F)(t) \wedge W = \mathrm{supply.of.faucet}(F)(t)\ \wedge \\
&\ C = \mathrm{supply.cont.of.faucet}(F)(t)\ \wedge \\
&\ \mathrm{Contains}(C,W)(t) \wedge \mathrm{vol}(W)(t) > \mathrm{cups}(1000)\ \wedge \\
&\ \mathrm{Exists}(F)(t) \wedge \mathrm{Exists}(K)(t) \wedge \mathrm{Exists}(P)(t)\ \wedge \\
&\ \mathrm{Exists}(C)(t) \wedge \mathrm{Exists}(W)(t)\ \wedge \\
&\ \mathrm{contained.vol}(P)(t) = \mathrm{cups}(1) \wedge \mathrm{All.water}(W)(t)\ \wedge \\
&\ \mathrm{Identifiable}(P)(t) \wedge \mathrm{Identifiable}(K)(t)\ \wedge \\
&\ \mathrm{Near}(P)(t) \wedge \mathrm{Near}(K)(t) \wedge \mathrm{Graspable}(K)(t)]
\end{aligned}
\tag{25}
$$

Additional  assumptions  are  needed  in  a  complete  proof,  such  as  that  certain  objects 
do  not  move  throughout,  that  the  open  and  closed  positions  for  K  do  not  change,  etc. 




After executing S1, the knob K is grasped, i.e., "Grasped(K)(T1)". This follows
trivially since the given condition in Formula 23 satisfies the condition of Formula 16.

While executing S2, the robot moves K (the currently grasped object) to its open
position. Let T'1 denote the instant that K first becomes fully open, after which it
remains open. T'1 is in the interval "(T1..T2)". Also, according to Formula 18, the
robot begins to move K immediately at T1.

$$
\mathrm{Open.knob}(K)(T'_1..T_2) \wedge
[\forall t \in (T_1..T'_1)][\neg\mathrm{Open.knob}(K)(t) \wedge \neg\mathrm{Closed.knob}(K)(t)]
\tag{26}
$$

For similar reasons, during the execution of S4, there is some instant when K becomes
fully closed and remains closed (using Formula 19). Let this instant be T'3, which is in
the interval "(T3..T4)".

$$
\mathrm{Closed.knob}(K)(T'_3..T_4) \wedge
[\forall t \in (T_3..T'_3)][\neg\mathrm{Open.knob}(K)(t) \wedge \neg\mathrm{Closed.knob}(K)(t)]
\tag{27}
$$

I will now show that a "Faucet.flow" process begins at T1 and ends at T'3. However,
first I make the default assumption that K remains fully closed during "(T0..T1)", fully
open during "(T2..T3)", and fully closed during "(T4..T5)".

Default assumption:

$$
\mathrm{Closed.knob}(K)(T_0..T_1) \wedge \mathrm{Open.knob}(K)(T_2..T_3) \wedge
\mathrm{Closed.knob}(K)(T_4..T_5)
\tag{28}
$$

As a result, K is fully closed before T1 and it is not fully closed just after T1 (note
that nothing needs to be said about K's status precisely at T1). This satisfies the left
side of Formula 11 at T1, leading me to conclude that there is a "Faucet.flow"
process, which I'll call FF, with F as its "faucet.of.flow", that begins at T1 and
continues while K is not closed. However, Formula 11 will not let me conclude that FF
ends at T'3. Formula 13 is needed to determine process endings. Applying Formula 13
at T'3, I conclude that a "Faucet.flow" process, which I'll call FF2, has F as its
"faucet.of.flow", ends at T'3, and has continued for as long as K has not been closed.
Of course, there is only one process here, which is concluded from Formula 14. Since
FF and FF2 use the same faucet, F, and their occurrences overlap (e.g., at T2), then
"FF2=FF".

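$$
\begin{aligned}
&\mathrm{Faucet.flow}(FF) \wedge F = \mathrm{faucet.of.flow}(FF)\ \wedge \\
&\mathrm{Occurs}(FF)(T_1..T'_3) \wedge [\forall t][t < T_1 \rightarrow \neg\mathrm{Occurs}(FF)(t)]\ \wedge \\
&[\forall t][t > T'_3 \rightarrow \neg\mathrm{Occurs}(FF)(t)]
\end{aligned}
\tag{29}
$$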

Thus,  the  robot  can  identify  a  faucet  flow  process  and  can  determine  its  times  of 
occurrence. 


Given the times of occurrence of FF, I can now determine its effect. First, I assume
that P receives the water flowing from F (space does not allow a discussion of the
necessary geometry).

$$
[\forall t \in (T_1..T'_3)][P = \mathrm{receptacle.of.flow}(FF)(t)]
\tag{30}
$$

By applying the formula describing the effects of "Faucet.flow", Formula 15, to the
above times for FF's occurrence (Formula 29), I conclude that a liquid transfer took
place from C to P during "(T1..T'3)".

$$
\mathrm{Liq.xfer}(C, P, T_1, T'_3)
\tag{31}
$$

So, granules are accumulating in P that come from C (i.e., are part of what was W).
From this, I can conclude that water is accumulating in P (and if I added more
theories, that this water has properties similar to those of W, such as being either hot
or cold). Also, given that FF is occurring, I can conclude the approximate rates of
transfer. During "(T'1..T3)", it transfers at the maximum rate of 1 cup every four
seconds. During the other times it transfers at a rate somewhere between 1 cup per
minute and 1 cup per 4 seconds.

I now make the default assumptions that the liquid transferred by FF remains in P
throughout execution of FP and that it remains liquid. Also, any non-gaseous object
in P during execution of FP came from F's water supply, W.




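Default assumption:

$$
\begin{aligned}
&[\forall x,\ t \in (T_0..T_5)] \\
&[(\mathrm{Liquid}(x)(t) \wedge \mathrm{Contains}(P,x)(t) \rightarrow \\
&\qquad [\forall t' \in (t..T_5)][\mathrm{Liquid}(x)(t') \wedge \mathrm{Contains}(P,x)(t')])\ \wedge \\
&\ (\neg\mathrm{Gas}(x)(t) \wedge \mathrm{Contains}(P,x)(t) \rightarrow \\
&\qquad \mathrm{gset}(x)(t) \subseteq \mathrm{gset}(W)(T_0))]
\end{aligned}
\tag{32}
$$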

Given the above, I conclude that P will continue to fill with water and that,
eventually, "Less.full(P,0.5)" will be false. In fact, this will happen between 0 and 2
seconds after T2, taking into account the varying rate of water flow and the fact that
the time of T'1 is not precisely known. Therefore, S3 takes between 0 and 2 seconds
to execute, and the entire program takes between 8 and 10 seconds. So, the robot
should begin execution at "T0 = TG - seconds(10)" to be sure P will be filled in time. It
turns out that during the execution of S4, another half cup of water could flow, so P
will be between half and completely full.
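
The timing bounds above can be checked with a small simulation. This is a rough
Python sketch, not the proof itself, and it rests on simplifying assumptions:
water is taken to flow at the maximum rate of 1 cup per 4 seconds from the
instant K is fully open (T'1) until the instant it is fully closed (T'3), and
not at all before T'1. The names "run_fp", "open_lag" and "close_lag" are
hypothetical; the two lags say how far into S2 and S4 the knob reaches its open
and closed positions, and each must lie between 0 and 2 seconds.

```python
from fractions import Fraction

OUTPUT_SECONDS = Fraction(2)   # from the text: each output procedure takes 2 seconds
IDLE_STEP = Fraction(1, 10)    # idle.for seconds(0.1)
MAX_RATE = Fraction(1, 4)      # cups per second with the knob fully open
CAPACITY = Fraction(1)         # contained.vol(P) = cups(1)
TARGET = Fraction(1, 2)        # the fraction passed to Less.full(P, 0.5)


def run_fp(open_lag, close_lag):
    """Simulate FP and return (total_seconds, cups_in_pot)."""
    t = Fraction(0)                 # T0
    t += OUTPUT_SECONDS             # S1: grasp K, ends at T1
    flow_start = t + open_lag       # T'1, assumed start of full flow
    t += OUTPUT_SECONDS             # S2: open.knob K, ends at T2
    # S3: while Less.full(P, 0.5) do idle.for seconds(0.1)
    while (t - flow_start) * MAX_RATE / CAPACITY < TARGET:
        t += IDLE_STEP
    flow_end = t + close_lag        # T'3, assumed end of flow
    t += OUTPUT_SECONDS             # S4: close.knob K, ends at T4
    t += OUTPUT_SECONDS             # S5: release, ends at T5
    cups = (flow_end - flow_start) * MAX_RATE
    return t, cups
```

Exact fractions are used so that the loop test is not disturbed by
floating-point rounding. With both lags at 0 the program takes 8 seconds and
leaves half a cup in P; with both lags at 2 it takes 10 seconds and leaves a
full cup, which matches the bounds stated above under these assumptions.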




I am nearly at the given goal, Formula 24, but it is stated in terms of a liquid
object and not in terms of a set of liquid granules that are contained in P. However,
Formula 4 lets the robot identify the liquid in P as a physical object, and so the goal
is achieved.

3.6  Conclusions 




Physics For Robots (PFR) represents the everyday physics that a robot needs to use
in planning to perform everyday tasks. Using a PFR representation scheme, a robot
can reason about natural processes as well as actions. It can take into account the
time events take, the gradual changes they cause and the fact that many processes,
once initiated, continue without further attention. Therefore, it can plan to control
many processes simultaneously. PFR also specifies identity criteria for physical
objects that break apart, come together, mix, or come into or go out of existence.
Therefore, the robot can plan to recognize and manipulate objects undergoing
transformations, and to determine the properties of these objects based on their
material composition.


The contributions of this research are:

o a strategy to develop and evaluate representations of everyday physics for robot
  planning,

o a general representation for part of everyday physics including an ontology of
  time, space, physical objects and events, theories governing processes, material
  composition, etc.,

o an application-specific representation describing everyday phenomena from
  cooking, such as water flow from a faucet, etc.


The  crucial  research  to  be  done  next  is  not  only  to  extend  these  types  of 
representations  to  more  areas,  but  to  use  these  results  to  design  reasoning 
mechanisms  that  will  allow  robots  to  plan  for  everyday  tasks. 


3.7  Acknowledgements 

Many, many thanks go to David Israel, David McDonald, Candy Sidner, Brad Goodman,
N. S. Sridharan, Andy Haas, Marc Vilain and Krithi Ramamritham for their ideas and
comments.




REFERENCES 


[1] Forbus, K. D. The Role of Qualitative Dynamics in Naive Physics. In Formal
Theories of the Commonsense World, Ablex, 1985, pp. 185-226.

[2] Hayes, P. The Second Naive Physics Manifesto. In Formal Theories of the
Commonsense World, Ablex, 1985, pp. 1-36.

[3] Hayes, P. Naive Physics I: Ontology for Liquids. In Formal Theories of the
Commonsense World, Ablex, 1985, pp. 71-108.

[4] Hendrix, G. G. Modeling Simultaneous Actions and Continuous Processes.
Artificial Intelligence 4 (1973), 145-180.

[5] Schmolze, J. G. Physics For Robots. Ph.D. Th., University of Massachusetts,
February 1986. (Also BBN Laboratories Report No. 6222, July 1986.)




4. THE CASE FOR DOMAIN-SPECIFIC FRAME AXIOMS


Andrew  R.  Haas 


Abstract 


Several researchers have used non-monotonic logic in attempts to abolish
frame axioms that are specific to one domain in favor of a universal frame
axiom. We argue that the universal frame axiom cannot work in a domain that
allows incomplete descriptions of situations. Therefore domain-specific frame
axioms are needed. We illustrate an approach to writing these axioms by
considering frame axioms about motion, including the formalization of a simple
example.


4.1  Introduction 


The most fundamental problem in automated plan generation is the frame
problem [7]. Briefly stated, the frame problem consists of determining those aspects
of the world that are unaffected by the performance of an action. This is a
considerably broader class of properties than those which are affected by the action.
Most actions typically have well-circumscribed effects, and leave much of the world
unchanged. Moving a box from room to room, for example, does not change its color,
that of the rooms, the ambient temperature, or any property other than the
location of the box.


General solutions to the frame problem have unfortunately been very elusive. In
early theorem-proving planners [4], large numbers of first-order axioms were
provided to state all the properties that an action left unaffected. Not only was
writing these frame axioms a painstaking endeavor, but the preponderance of these
axioms in the theorem-proving database drastically curtailed the performance of the
planner. As a result, planning researchers essentially abandoned first-order
formalizations of actions and plans. Instead they focused on alternative solutions to
the frame problem, such as the add- and delete-lists of STRIPS and its many
descendants [3, 8, 1, among many others].




Recent efforts at formalizing planning have turned to non-monotonic solutions
to the frame problem. Most of these formalizations are centered around a general
non-monotonic frame axiom. This axiom usually sanctions the inference that a
proposition persists from some state in which it is true to a later state if it cannot
be proven that the proposition has been changed by an intervening action. This
inference is non-monotonic. It allows the planning system to assume something (i.e.,
a lack of change) on the basis of its inability to prove its inverse (i.e., change). The
advantage of this approach is that it makes unnecessary the multitudinous frame
axioms of the early formal systems. In this paper we claim that this move towards
non-monotonic logic was premature. It has not been demonstrated that natural
planning problems lead to huge numbers of frame axioms, since only the most obvious
approaches were tried. If we analyze a domain more deeply, we can reduce the
number of frame axioms to something manageable without introducing any new logics.

The key is to use the standard distinction between primitive and defined
predicates. For each primitive predicate there is a frame axiom which lists all the
actions that can change that predicate. To prove that action A does not change a
primitive predicate P, we show that A is not equal to any action on the list of actions
that can change P. To show that action A does not change a defined predicate Q, we
reduce Q to primitive predicates using its definition and then show that A does not
change those predicates. We demonstrate by examples that in problems about motion,
a non-obvious choice of primitive predicates can reduce the number of frame axioms
to something quite tractable. In general, we prefer to study concrete domains rather
than considering an arbitrary situation calculus.
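
The two-step check described above can be sketched in a few lines of Python.
This is an illustration under assumptions, not a formalization from this paper:
the predicate and action names in the tables ("at", "color", "move", "paint",
"red_box_in_room") are hypothetical, and the frame axioms are reduced to a
lookup table from each primitive predicate to the complete list of actions that
can change it.

```python
# Hypothetical frame axioms: each primitive predicate maps to the
# complete list of actions that can change it.
CAN_CHANGE = {
    "at": {"move"},
    "color": {"paint"},
}

# Hypothetical defined predicate, reduced to primitives by its definition.
DEFINITIONS = {
    "red_box_in_room": ["at", "color"],
}


def primitives_of(predicate):
    """Reduce a predicate to the set of primitive predicates it depends on."""
    if predicate in CAN_CHANGE:
        return {predicate}
    result = set()
    for part in DEFINITIONS[predicate]:
        result |= primitives_of(part)
    return result


def provably_unchanged(action, predicate):
    """True iff the action is on none of the change lists of the
    primitives underlying the predicate."""
    return all(action not in CAN_CHANGE[p] for p in primitives_of(predicate))
```

With these tables, moving a box provably leaves its color unchanged, while
nothing can be concluded about persistence of the defined predicate under
"move" or "paint", since each appears on the change list of one of its
primitives.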


4.2  An  Argument  Against  the  Universal  Frame  Axiom 




Consider a problem domain in which actions are described as relations over
situations and situations are described by asserting that certain conditions do or do
not hold. (change A1 C1 S1) means that if the agent performs action A1 in situation
S1 it will change the condition C1. There is a non-monotonic inference method whose
effect is this: if the agent cannot prove the sentence (change A1 C1 S1) by some given
proof procedure, then the agent concludes ~(change A1 C1 S1). We consider the case
in which some situations are not completely described. This might happen either
because the initial situation is not fully described, or because some of the actions are
non-deterministic. Finally, we assume there is no information about particular
situations concealed in the proof procedure used in non-monotonic reasoning. Thus if
the robot has limited knowledge of situation S1, the proof procedure can produce only
limited predictions about the effects of performing actions in situation S1.

Given these assumptions, we claim the non-monotonic inference method cannot
be reliable. Suppose the agent lacks complete knowledge of a situation S1, either
because it is an underspecified initial situation or because it is the result of applying
a non-deterministic action to a fully specified situation. Since the proof technique
cannot make up for this ignorance, it follows that there are true statements of the
form (change A2 C2 S1) that the robot cannot prove. By non-monotonic inference the
robot concludes that these statements are false. Thus non-monotonic reasoning is not
reliable.

One may reply that a non-monotonic inference method is allowed to make some
mistakes, by definition. For this reply to be convincing one would have to show that
the mistakes will be rare or unimportant. But it may well be that there are many
important gaps in the agent's knowledge of the situation S1. Then there are likely to
be many important statements (change A2 C2 S1) that the agent cannot prove and
wrongly concludes are false. The universal frame axiom is reliable only when nearly all
important facts about each situation are available.

We have assumed that the non-monotonic inference method applies to all
situations, no matter how little the agent knows about them. One alternative is a
non-monotonic inference method that applies only when the agent knows the relevant
facts about the starting situation. We might have "If the agent knows the location of
every object that is inside room R in situation S, and action A1 takes place inside
room R, and the agent cannot prove (change A1 C S), then ~(change A1 C S)". Such
an axiom may indeed be useful, but it is not domain-independent; that is, it does
not apply to every domain in which actions are relations over situations. Yet this is
what advocates of the universal frame axiom mean by "domain-independent". This
suggests that a weaker notion of domain-independence might be useful, and we will
return to this possibility.

Hanks and McDermott [5] pointed out that even in the case of a deterministic
domain, it is not obvious how to formalize a domain-independent frame axiom like "if
you can't prove action A changes condition C in situation S, assume it doesn't".
Various authors (e.g., Kautz [6]) have shown that one can do it by applying this frame
axiom to the steps of a plan in order of their occurrence. These solutions are all
domain-independent, so by the above argument they ought to produce false
conclusions in cases where there are incomplete descriptions of situations. For
simplicity assume the initial state is underspecified. Suppose the axioms say that the
action Shoot changes the condition Alive in any situation S where Alive and Loaded
are true. Suppose the initial situation S0 is not completely described: it is known
that Loaded is true in S0, but not known whether Alive is true in S0 or not. Then the
agent cannot prove that Shoot changes Alive in S0, and it concludes that Shoot does
not change Alive in S0. It then follows by ordinary monotonic logic that Alive was false
in the initial situation. This is the wrong conclusion, at least according to my
intuition about what the frame axiom was supposed to mean.
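
The Shoot example can be replayed mechanically. The Python sketch below is a
toy model under assumptions, not any published formalization: a situation is a
dictionary of condition values, "provable" means true in every completion of
the agent's partial knowledge, and the function names are hypothetical. It
applies the universal frame axiom to a partially described S0 and shows which
completions survive.

```python
from itertools import product

CONDITIONS = ("alive", "loaded")


def shoot_changes_alive(world):
    """Domain axiom from the text: Shoot changes Alive wherever
    Alive and Loaded are both true."""
    return world["alive"] and world["loaded"]


def completions(known):
    """Enumerate every fully specified situation consistent with `known`."""
    unknown = [c for c in CONDITIONS if c not in known]
    for values in product([True, False], repeat=len(unknown)):
        yield {**known, **dict(zip(unknown, values))}


def provable(statement, known):
    """A statement is provable iff it holds in every completion."""
    return all(statement(w) for w in completions(known))


def apply_universal_frame_axiom(known):
    """If the change is not provable, assume it does not happen, and
    return the completions that remain consistent with that assumption."""
    worlds = list(completions(known))
    if not provable(shoot_changes_alive, known):
        worlds = [w for w in worlds if not shoot_changes_alive(w)]
    return worlds
```

Starting from knowledge that only says Loaded is true, the single surviving
completion has Alive false, which is the unintended conclusion described above.
When S0 is fully specified, the default never fires and nothing is lost.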

If the initial situation is fully specified, and all the actions are deterministic,
then applying the frame axiom to the actions in order of their occurrence will
guarantee that each situation is fully described when the frame axiom is applied to it.
Only then will the universal frame axiom produce correct results. Suppose instead that
we use frame axioms which explicitly require that the relevant knowledge is available,
such as the room axiom above. If relevant facts about a situation are missing
because the frame axioms have not yet been applied to the step that produced that
situation, then these frame axioms will not apply to that situation. They will also not
apply if the relevant facts are missing for some other reason, for example because
of a non-deterministic action. Thus by using frame axioms which explicitly require the
relevant knowledge, we can drop the constraints that the initial situation is fully
specified, all actions are deterministic, and the frame axiom is applied to the steps of
a plan in order of occurrence. These are really only consequences of a general
principle: default rules cannot be applied safely unless all the relevant knowledge is
available.


4.3 Domain-Specific Frame Axioms

People usually discuss the frame problem quite abstractly, with little attention
to concrete plans. This has gone so far that Hanks and McDermott created a stir by
discussing a concrete plan of three steps, in a domain with three predicates! Davis [2]
looked at a less trivial domain, a subset of classical mechanics. He found that no
frame axioms at all were needed. He had to assume, in each example, that a short list
of objects were the only ones involved in the example. Given that, the general axioms
about force and motion sufficed to prove that objects remained in their places. The
present paper will not offer a general discussion of domain-specific frame axioms;
instead we consider one particular problem involving frame axioms about motion, and
we illustrate our approach to writing domain-specific frame axioms by formalizing a
simple example.

If you load your possessions in the trunk of your car and drive it across town,
all your possessions will move across town, but they will remain in your car. How can
we describe this situation with reasonable frame axioms? We rely on the distinction
between primitive and defined predicates, used in STRIPS. There are frame axioms only
for primitive predicates; to prove the persistence of defined predicates we reduce them
to primitive predicates using their definitions. (at object spot) says that an object is
resting on the street at a certain spot. (in object container) says that an object is
sealed in a container. These are primitive predicates. (location object spot) is a
defined predicate: it holds if either (at object spot) or there is an object2 such that
(at object2 spot) and (in object object2). A frame axiom says that driving will not
delete (in object car). Then if driving across town makes (at car spot2) true while
preserving (in object car), we can deduce that (location object spot2) holds.
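
The car example can be written out as a short Python sketch. It is only an
illustration under assumptions: the two primitive predicates are represented as
dictionaries, and the function names "location" and "drive" are hypothetical.
The defined predicate follows the definition in the text, with one level of
containment.

```python
def location(obj, at, inside):
    """Defined predicate: the spot of obj, via (at obj spot) or via
    (in obj container) and (at container spot). Returns None if neither
    applies.

    at:     dict object -> spot       (primitive predicate "at")
    inside: dict object -> container  (primitive predicate "in")
    """
    if obj in at:
        return at[obj]
    container = inside.get(obj)
    if container is not None and container in at:
        return at[container]
    return None


def drive(at, inside, car, destination):
    """Driving changes (at car ...) only. The frame axiom that driving
    does not delete (in object car) is modelled by returning `inside`
    untouched."""
    new_at = dict(at)
    new_at[car] = destination
    return new_at, inside
```

After "drive", no fact about the suitcase has been updated, yet "location"
reports its new spot because the definition is re-evaluated against the car's
new "at" fact.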


In general, an assembly is a collection of objects that can be moved as a group.
My car and its contents form an assembly; if I am holding an object then I and that
object belong to an assembly. The planet Earth and all its contents form a very large
assembly. Primitive predicates describe the location of an object only with respect to
the smallest assembly that contains that object. To describe an object's location with
respect to a larger assembly, we use defined predicates. Frame axioms assert that in
certain cases we can move an assembly while preserving the locations of its parts with
respect to the assembly. Given the new location of the assembly in some large
scale frame of reference, we can infer the new locations of its parts in that large
scale frame of reference.


Consider a domain based on the STRIPS domain of rooms and boxes. The domain
contains five kinds of objects: boxes, rooms, doors, spots, and the robot itself. Spots
are pieces of floor inside the rooms, and whenever an object is resting on the
floor it is in exactly one spot. The robot is always resting on the floor, but the boxes
are not. A box may rest on the floor, or the robot may be holding it, but not both.
The robot can hold many objects at once. We use the following predicates, with
variables marked by colons.


    (result :s :a :s')      Action :a turns situation :s into :s'.

    (at :obj :sp :s)        Object :obj is at spot :sp in situation :s.

    (next :sp1 :sp2)        The spot :sp1 is next to the spot :sp2.

    (holding :b :s)         The robot is holding box :b in situation :s.

    (in :sp :r)             Spot :sp is in room :r.

    (connect :d :r1 :r2)    Door :d connects room :r1 and room :r2.


Only two predicates have situation arguments, because only these predicates can
be changed by actions. The robot and the boxes it is holding form an assembly; the
primitive predicate "holding" describes the location of a box with respect to that
assembly.


The actions are as follows.

    (pickup :b)     The robot picks up box :b.

    (putdown :b)    The robot puts box :b down.

    (goto :b)       The robot goes to the location of object :b (which must be in
                    the same room as the robot).

    (gothru :d)     The robot goes through door :d.


We  assume  that  if  f  and  g  are  distinct  terms  and  denote  actions,  they  denote 
distinct  actions.  This  assumption  can  be  implemented  with  an  axiom  schema. 


Each action has an axiom to describe the changes that it causes when executed
with the correct preconditions. The robot can pick up a box if the robot is at a spot
next to the box's spot.

    [ (at :b :sp1 :s) & (at Robot :sp2 :s) & (next :sp1 :sp2)
      & (result :s (pickup :b) :s') ] → (holding :b :s')

It is not necessary to postulate that this action deletes the old location of :b;
that follows because an object cannot be in a spot when the robot is holding it.




The robot can put down any box that it is holding.

[ (at Robot :sp :s) & (holding :b :s) & (result :s (putdown :b) :s') ]
→ (∃ :sp' . (at :b :sp' :s') & (next :sp :sp'))

The robot can go to a spot next to box :b if :b is in the same room as the
robot.

[ (at :b :sp1 :s) & (at Robot :sp2 :s) & (in :sp1 :room) & (in :sp2 :room)
  & (result :s (goto :b) :s') ]
→ (∃ :sp3 . (at Robot :sp3 :s') & (next :sp3 :sp1))

The robot can go from room to room through a door.

[ (at Robot :sp1 :s) & (in :sp1 :room1) & (connect :door :room1 :room2)
  & (result :s (gothru :door) :s') ]
→ (∃ :sp2 . (at Robot :sp2 :s') & (in :sp2 :room2))

For each primitive predicate there are two classes of frame axioms: those that
limit the class of actions that add the predicate, and those that limit the class of
actions that delete the predicate. For the predicate "at", we consider first the
movement of the robot. Only GoTo and GoThru can add or delete "at" for the robot.

[ (at Robot :sp :s) & ¬(at Robot :sp :s') & (result :s :act :s') ]
→ [ (∃ :b . :act = (GoTo :b)) ∨ (∃ :d . :act = (GoThru :d)) ]

[ ¬(at Robot :sp :s) & (at Robot :sp :s') & (result :s :act :s') ]
→ [ (∃ :b . :act = (GoTo :b)) ∨ (∃ :d . :act = (GoThru :d)) ]

For boxes, only Pickup deletes "at", and only Putdown adds "at".

[ (isBox :b) & (at :b :sp :s) & ¬(at :b :sp :s') & (result :s :act :s') ]
→ :act = (pickup :b)

[ (isBox :b) & ¬(at :b :sp :s) & (at :b :sp :s') & (result :s :act :s') ]
→ :act = (putdown :b)

Putdown is the only action that deletes "holding", and Pickup is the only action
that adds "holding".

[ (holding :b :s) & ¬(holding :b :s') & (result :s :act :s') ]
→ :act = (putdown :b)

[ ¬(holding :b :s) & (holding :b :s') & (result :s :act :s') ]
→ :act = (pickup :b)
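
The frame axioms above can be read as a test on any pair of situations joined by
an action: every "at" or "holding" fact that appears or disappears must be
explained by an action of the permitted kind. The following Python fragment is a
minimal sketch of that reading, not part of the original formalization. The
names Situation and frame_violations are hypothetical, actions are encoded as
tuples such as ("pickup", "B1"), and every object other than the robot is
assumed to be a box.

```python
from dataclasses import dataclass

ROBOT = "Robot"

@dataclass(frozen=True)
class Situation:
    at: frozenset       # facts (object, spot); a held box has no "at" fact
    holding: frozenset  # boxes the robot is holding

def frame_violations(before, after, action):
    """List the fact changes between two situations that `action` cannot explain."""
    name = action[0]
    violations = []
    # "at" facts that were added or deleted by the transition.
    for obj, spot in before.at ^ after.at:
        added = (obj, spot) in after.at
        if obj == ROBOT:
            # Only goto and gothru may add or delete "at" for the robot.
            ok = name in ("goto", "gothru")
        else:
            # Assumption: every other object is a box.
            ok = action == (("putdown" if added else "pickup"), obj)
        if not ok:
            violations.append(("at", obj, spot, "added" if added else "deleted"))
    # "holding" facts that were added or deleted by the transition.
    for box in before.holding ^ after.holding:
        added = box in after.holding
        if action != (("pickup" if added else "putdown"), box):
            violations.append(("holding", box, "added" if added else "deleted"))
    return violations

s0 = Situation(at=frozenset({(ROBOT, "p1"), ("B1", "p2")}), holding=frozenset())
s1 = Situation(at=frozenset({(ROBOT, "p1")}), holding=frozenset({"B1"}))
s2 = Situation(at=frozenset({(ROBOT, "p5")}), holding=frozenset({"B1"}))

print(frame_violations(s0, s1, ("pickup", "B1")))  # pickup explains both changes
print(frame_violations(s1, s2, ("gothru", "D1")))  # gothru explains the robot's move
print(frame_violations(s0, s1, ("gothru", "D1")))  # gothru cannot explain these changes
```

The third call reports two unexplained changes, which is the contrapositive use
of the axioms: since gothru is neither pickup nor putdown, it cannot have
changed what the robot is holding.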

These axioms allow us to prove, for example, that when the robot travels from
one room to another the set of boxes it is holding does not change. Suppose
(InRoom :b :r :s) holds when box :b is in room :r in situation :s. This predicate
cannot be primitive because it describes the location of a box with respect to the
large-scale frame of reference, even when that box is being held by the robot and so
forms part of an assembly. By contrast, the primitive predicate "at" describes a
box's location with respect to a large-scale frame of reference, but only when the
box is not part of an assembly, that is, when the robot is not holding the box. The
predicate "InRoom" can of course be defined: (InRoom :b :r :s) holds iff

(∃ :sp . (in :sp :r) &
         [ (at :b :sp :s) ∨ [ (at Robot :sp :s) & (holding :b :s) ] ])

One can then prove that if the robot goes to room A while holding block B1, the
robot will still be holding B1 and so B1 will be in room A.
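
The defined predicate can be illustrated with a small Python sketch. It is an
illustration under stated assumptions rather than the proof itself: the
function names in_room and gothru, the dictionary encoding of "at" and "in",
and the spot and room names are all hypothetical, and the gothru function
simply builds in what the frame axioms guarantee, namely that going through a
door changes only the robot's own "at" fact.

```python
ROBOT = "Robot"

def in_room(box, room, at, holding, spot_room):
    """(InRoom :b :r :s): some spot of `room` holds the box itself,
    or holds the robot while the robot is holding the box."""
    for spot, r in spot_room.items():
        if r != room:
            continue
        if at.get(box) == spot:
            return True
        if at.get(ROBOT) == spot and box in holding:
            return True
    return False

def gothru(at, holding, new_spot):
    """Effect of (gothru :d): only the robot's "at" fact changes.
    By the frame axioms, the set of held boxes is left untouched."""
    new_at = dict(at)
    new_at[ROBOT] = new_spot
    return new_at, holding

spot_room = {"p1": "A", "p2": "B"}                 # (in :sp :r) facts
at, holding = {ROBOT: "p2"}, frozenset({"B1"})     # robot in room B, carrying B1

at2, holding2 = gothru(at, holding, "p1")          # robot arrives at a spot in room A

print(in_room("B1", "B", at, holding, spot_room))    # before the move
print(in_room("B1", "A", at2, holding2, spot_room))  # after the move
```

Both calls print True: B1 has no "at" fact of its own while it is held, so its
room is computed from the robot's location in each situation.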

The notion of an assembly is certainly not domain-independent in the sense that
it applies to any domain where actions are relations over situations. Yet it does apply
to a rich variety of situations in everyday life, from carrying a cup of coffee
across the room to flying a plane across country. We argue that this kind of "domain
independence" is more feasible and more useful than the kind of domain independence
that the universal frame axiom was supposed to have. There may be many ideas
which, like the notion of an assembly, are useful in a wide variety of everyday
situations. Planning research should aim to discover ideas like these, rather than
ideas that apply to any domain containing situations and actions. People who build
theorem provers are now less interested in universal first-order theorem provers and
more concerned with programs designed to reason about crucial concepts like time or
belief. People who study the frame problem should likewise look not for a universal
solution, but for ideas that apply often in the world of common sense and everyday
life.




References 


1. Chapman, David. Nonlinear Planning: A Rigorous Reconstruction. Proceedings of
the Ninth International Joint Conference on Artificial Intelligence, IJCAI, Los Angeles,
California, August 1985, pp. 1022-1024.

2. Davis, Ernest. A Logical Framework for Solid Object Physics. Technical Report 245,
Department of Computer Science, New York University, October 1986.

3. Fikes, R., and Nilsson, N.J. "STRIPS: A new approach to the application of theorem
proving to problem solving". Artificial Intelligence 2 (1971), 189-208.

4. Green, C. Application of theorem-proving techniques to problem-solving.
Proceedings of the International Joint Conference on Artificial Intelligence, Washington,
D.C., May 1969.

5. Hanks, Steve, and McDermott, Drew. Default Reasoning, Non-Monotonic Reasoning,
and the Frame Problem. Proceedings of AAAI-86, American Association for Artificial
Intelligence, 1986, pp. 328-333.

6. Kautz, H.A. The Logic of Persistence. Proceedings of AAAI-86, the 5th National
Conference on Artificial Intelligence, American Association of Artificial Intelligence,
August 1986, pp. 401-405.

7. McCarthy, J., and Hayes, P.J. Some Philosophical Problems from the Standpoint of
Artificial Intelligence. In Machine Intelligence 4, B. Meltzer & D. Michie, Eds., American
Elsevier, New York, 1969.

8. Sacerdoti, E.D. A Structure for Plans and Behavior. American Elsevier, New
York, 1977.




5.  A  COMPOSITIONAL  SEMANTICS  FOR  DIRECTIONAL  MODIFIERS 

Erhard W. Hinrichs


Abstract

This paper presents a model-theoretic semantics for directional modifiers in
English. The semantic theory presupposed for the analysis is that of
Montague Grammar (cf. Montague 1970, 1973), which makes it possible to
develop a strongly compositional treatment of directional modifiers. Such a
treatment has significant computational advantages over case-based
treatments of directional modifiers that are advocated in the AI literature.

5.1  Case-based  Treatments

Among natural language processing systems which attempt to incorporate spatial
information, the following strategy seems to prevail: directional or locative modifiers
are treated either as corresponding to slots in case frames in the canonical lexical
representations of verbs (cf. Celce-Murcia 1972; Hendrix, Thompson and Slocum 1973), or as
corresponding to conceptual cases in the (meta-linguistic) conceptualization of actions
(Schank 1975).

Case-based approaches to the semantics of directional modifiers can be
characterized as 'weakly compositional' in the following sense. In a verb phrase such
as fly to Chicago the prepositional phrase contributes semantically the meaning of the
NP Chicago as the directional or locative goal of the action associated with the verb
phrase. However, the directional preposition to itself does not make a semantic
contribution at all to the meaning of the verb phrase as a whole. Instead, to merely
serves as a syntactic marker for a semantic entity, namely locative or directional case,
whose meaning cannot be separated from, but rather is an integral part of, a given
verb frame or conceptual structure. By contrast, the semantics of directional
modifiers that I will be advocating in this paper is strongly compositional in the sense
that directional prepositions serve as autonomous syntactic and semantic units.
Consequently, each word in a phrase such as fly to Chicago contributes its own,
independent meaning to the meaning of the phrase as a whole.


This strongly compositional analysis of directional modifiers has a number of
crucial computational advantages over case-based approaches. Consider how
inferences between sentences such as (1) and (2) can be handled by the two types of
approaches.

(1) John went to New York.

(2) John was in New York.


In Schank (1975, p. 53) sentence (1) corresponds to the conceptual structure in (3).

(3) [Diagram: John <=> PTRANS, marked p (past), with object (o) John and
    direction (D) to New York from X]

(3) should be read as: John is at some time in the past (p) engaged in an act of
physical transfer (PTRANS) whose object (o) is John and whose direction (D) is from
some location X to New York. The fact that (1) implies (2) is expressed by attaching
to the bi-directional arrow in (3) the structure in (4) (cf. Schank 1975, p. 54).

(4) [Diagram: LOC(N.Y.)]


Schank calls the r-link (r for result) between structures (3) and (4) an inference.
However, the term inference is really a misnomer because the association between
structures such as (3) and (4) is merely a matter of stipulation and does not follow
from any general principles or axioms that would constrain the language of conceptual
structures. For that matter, there is nothing in Schank's system that prevents a link
between (3) and a structure which expresses that John does not reach the location
New York. In the analysis we will develop below, on the other hand, the inference
between (1) and (2) follows logically from the semantics of motion verbs such as go in
conjunction with the semantics of directional modifiers.


Consider next the issue of how easy or difficult it is to upscale natural language
systems whose treatment of directional modifiers is case-based. Assume a case-based
system in which only those verbal frames or conceptual structures are implemented
that relate locative or directional case to verbs of motion. Now imagine that we want
to extend coverage to verbs such as wave which, as illustrated in (5), allow directional
modifiers such as to.




(5) The President waved to the reporters.

Since wave, unlike verbs of motion, does not entail a change of location for the agent
involved, a new verbal frame or conceptual structure would have to be introduced into
a system which only covers motion verbs. Moreover, locative or directional case would
have to be reintroduced into the system as well, because in a case-based system the
specific effect of a given semantic case has to be determined for each individual frame
or conceptual structure. This is a direct consequence of the weakly compositional
semantics of such systems and in turn leads to a highly redundant method of
upscaling. Since our analysis of directional modifiers is, by contrast, strongly
compositional, upscaling becomes much easier. In the case of extending coverage to a
verb like wave, all that needs to be added is the lexical semantics for the verb itself,
while the semantics of directional modifiers can remain untouched.
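
The upscaling argument can be made concrete with a small Python sketch. It is
only an illustration under simplifying assumptions, not the formal analysis
developed below: events are flattened into records with a kind, an agent and
an ordered tuple of locations, the point of origin is passed explicitly
instead of being an implicit reference point, and the names Event, verb, to
and LEXICON are hypothetical. The point it shows is that the modifier is one
function applying to any verb meaning, so covering wave requires a new lexical
entry and nothing else.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Event:
    kind: str
    agent: str
    path: Tuple[str, ...]   # locations covered by the event, in temporal order

# A verb meaning is a predicate over an agent and an event.
VerbMeaning = Callable[[str, Event], bool]

def verb(kind: str) -> VerbMeaning:
    """Lexical semantics of a verb: the event is of this kind and has this agent."""
    return lambda agent, e: e.kind == kind and e.agent == agent

def to(destination: str, origin: str) -> Callable[[VerbMeaning], VerbMeaning]:
    """Directional modifier: maps any verb meaning to a modified verb meaning."""
    def modify(p: VerbMeaning) -> VerbMeaning:
        return lambda agent, e: (
            len(e.path) >= 2
            and e.path[0] == origin
            and e.path[-1] == destination
            and p(agent, e)          # the unmodified verb phrase must also hold
        )
    return modify

LEXICON = {"go": verb("go"), "slither": verb("slither")}

# Extending coverage to "wave": one lexical entry, `to` is left untouched.
LEXICON["wave"] = verb("wave")

trip = Event("go", "John", ("Boston", "New York"))
greeting = Event("wave", "President", ("podium", "reporters"))

went_to_ny = to("New York", origin="Boston")(LEXICON["go"])
waved_to_reporters = to("reporters", origin="podium")(LEXICON["wave"])

print(went_to_ny("John", trip))                   # modified verb phrase holds
print(LEXICON["go"]("John", trip))                # so does the unmodified one
print(waved_to_reporters("President", greeting))  # same modifier, new verb
```

In a case-based design the corresponding step would require a new frame for
wave together with its own statement of what the directional case means for
that frame.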

Finally, consider how a case-based approach to directional modifiers fares with
respect to phrases such as the ones given in (6).

(6) From Russia with Love

    To New York and then to Atlanta

Since in case-based systems locative or directional case is a relational notion and is
crucially dependent on a verbal frame or conceptual structure, it becomes impossible
to assign an interpretation to verbless phrases as in (6). One strategy for extending
case-based systems to such verbless phrases would consist in supplementing the
relational notion of directional or locative case by a non-relational counterpart which
does not depend on some verbal frame or conceptual structure. But the resulting
account of locative or directional case would once again be highly redundant, since
essentially all of the cases in the system would have to be split into a relational and
a non-relational version.


5.2  Motion  Verbs  as  Location  Predicates

In their literal, locative use, to and toward typically modify motion verbs
such as walk, run, drive, slither, move, etc. An adequate treatment of the directional
modifiers themselves is, therefore, closely connected to a semantic account of such
motion verbs. The treatment of motion verbs that I will adopt in this paper is that
developed in Hinrichs (1985), where I argue that motion verbs should be treated as
stage-level predicates in the sense of Carlson (1977), namely as predicates whose
arguments refer to stages of individuals. Stages are connected to individuals in
Carlson's ontology by a realization relation R, which associates a given individual with
all of the (spatio-temporal) stages at which that individual is present.


Motion verbs such as move can be understood as prototypical examples of
stage-level predicates, since such verbs predicate something about the spatio-temporal
location of one or more objects. Following Hinrichs (1985), I interpret a motion verb
like move in terms of a three-place stage-level predicate $move^{+}$, whose first two
argument positions range over individual stages realizing the referents of the object
and subject NPs, respectively. The rightmost argument position ranges over event
stages realizing the event that the referents of the subject and object NPs are
engaged in. Thus, $move^{+}(x^s)(y^s)(e^s)$ should be read as: "the referents of $x^s$ and
$y^s$ are engaged in an event stage $e^s$ realizing an event of moving." As is customary in
Montague Grammar, we express constraints on lexical meaning in terms of meaning
postulates that constrain the set of possible models of semantic interpretation.¹ The
meaning postulate in (7) states that an event stage $e^s$ which realizes a moving event
spatio-temporally includes (symbolized as $\subseteq$) at least the location of the referent
denoted by the object argument, i.e. $y^s \subseteq e^s$. This does not exclude the possibility
that the location of the referent of the subject NP can be contained in the event
stage as well, but this is not required for move in view of examples like (8).

(7) $\forall x^s, y^s, e^s\,[\,move^{+}(x^s)(y^s)(e^s) \rightarrow y^s \subseteq e^s\,]$

(8) John moved the troops.

Of course, different motion verbs will have different properties with respect to the
relative locations of event stages and those stages that realize the individuals
involved in these event stages. Take verbs like slither, walk, and run, for example,
which in our framework are analyzed as two-place stage-level predicates. For these
predicates it seems reasonable to simply equate the location of the event stage with
the location of the agent, i.e. the referent of the subject NP. This can be enforced by
a meaning postulate as in (9).

(9) $\forall x^s, e^s\,[\,\zeta^{+}(x^s)(e^s) \rightarrow x^s = e^s\,]$, where $\zeta$ translates slither, walk, run, etc.

¹ All the meaning postulates appearing in this paper are formulated in the language of
extensional logic developed in Hinrichs (1985).



The lexical entailment associated with the verb move to the effect that the location of
the referent of the object NP changes can be captured by the meaning postulate in
(10). (The symbols $<$ and $\neq$ used in (10) stand for temporal precedence and spatial
inequality, respectively.)

(10) $\forall e^s, x_1^s, y^s, x^o\,[\,R(x_1^s, x^o) \;\&\; move^{+}(x_1^s)(y^s)(e^s) \rightarrow \exists x_2^s\,[\,R(x_2^s, x^o) \;\&\; x_1^s < x_2^s \;\&\; x_1^s \neq x_2^s\,]\,]$


5.3  The  Semantics  of  to  and  toward 




Now we let a to-phrase, as a modifier of untensed verb phrases (IV*), operate
semantically on the event stages in the denotation of the unmodified verb phrase in
such a way that the event stages in the denotation of the resulting IV* phrase
constitute a spatio-temporal path (in the sense of Cresswell 1978) from some
specified point of origin to the location of the term combining with to. The translation
of to is given in (11).

(11) to translates as $\lambda\mathcal{P}\,\lambda P\,\lambda l_1\,\lambda x^i\;\mathcal{P}(\lambda y^i\,\exists l_2\,[\,R(l_2, y^i) \;\&\; PATH(l_1, l_r, l_2) \;\&\; P(x^i)(l_1)\,])$

Ignoring for the time being the battery of lambda-abstractions at the beginning of the
formula, which make the translation of to come out to be of the right semantic type,
the formula following the lambda abstractions introduces an individual stage $l_2$
realizing an individual object $y^i$, which is the one bound by the noun phrase (NP)
combining with to to form the IV* modifier. The second conjunct in the formula
asserts that the denotation of the event stage located at $l_1$, which is to be bound by
the translation of the IV* phrase that the to-phrase combines with, qualifies as a
spatio-temporal path (a notion formally defined in Hinrichs 1985) between some point
of origin $l_r$ and the spatio-temporal location of the point of destination. Finally, the
third conjunct asserts the truth of the unmodified IV* phrase that the to-phrase
combines with. It is this last conjunct that automatically guarantees the inference
from sentences such as (12) to sentences such as (13).

(12) Fangs slithered to the rock.

(13) Fangs slithered.

Using the translation for to suggested in (11), sentence (12) receives the reduced
translation in (14) according to our analysis.

(14) $\exists e^s, e^i\,[\,R(e^s, e^i) \;\&\; PAST(e^s) \;\&\; \exists x^s\,[\,R(x^s, f) \;\&\; \exists x^o\,\forall z^o\,[\,rock'(z^o) \;\&\; \exists z^s\,[\,R(z^s, z^o) \;\&\; slither'(x^s)(e^s) \;\&\; PATH(e^s, l_r, z^s)\,] \leftrightarrow x^o = z^o\,]\,]\,]$

Paraphrasing (14), it says that there is an event stage realizing some individual event
of Fangs' slithering such that that event stage lies in the past and the spatio-
temporal location of the event stage constitutes a path between some implicit point of
reference $l_r$ and the location of some unique rock object. The point of reference $l_r$,
which occurs as a free variable in the formula in (14), is to be understood as an
indexical parameter similar to the notion of a reference point proposed by Reichenbach
(1947) for the interpretation of tenses in English.

Notice that the notion of a path in the translation of to in (11), and hence also
in the translation for (12) given in (14), is defined to hold of the process making up a
particular event. Moreover, due to the postulate in (9), the referent of the subject
NP, when it combines with a motion verb phrase such as slither to the rock, is realized
by a stage spatio-temporally co-extensive with the path denoted by the to-phrase. This
fact guarantees the inference between sentences such as (12) and (15).

(15) Fangs was at the rock.

For other classes of verbs the same type of inference, namely identifying the path with
the position(s) of the referent of the subject NP, cannot be drawn. For sentences
such as (16) we do not want to claim that the stages realizing John make up a path to
Boston. Rather it is the object NP, in this case an event term, that constitutes the
path. The same is true of (17): it is the ball whose locations constitute a path to the
location specified in the to-phrase.

(16) John made a phone call to Boston.

(17) Carol set the ball to Lucy.

Let us now turn to the treatment of the preposition toward, whose lexical
translation rule is given in (18).

(18) toward translates as $\lambda\mathcal{P}\,\lambda P\,\lambda e^s\,\lambda x^i\;\mathcal{P}(\lambda y^i\,\exists l\,[\,R(l, y^i) \;\&\; \exists l'\,[\,PATH(l', l_r, l) \;\&\; e^s \subseteq l' \;\&\; l_r \subset e^s \;\&\; P(x^i)(e^s)\,]\,])$

The translation for toward constrains the value of the event stage variable $e^s$ (to be
bound by the stage-level predicate of the IV* with which the toward-phrase
combines). The value for $e^s$ has to be spatio-temporally contained in some initial
segment of a path $l'$ from some implicit point of origin $l_r$ to the location $l$ of the
referent of the NP with which toward combines. The requirement that the value of
$e^s$ has to be an initial segment of such a path follows from the condition that the
implicit point of origin $l_r$ has to be properly contained in $e^s$. Proper containment is
necessary in order to prevent the value of $e^s$ from being equal to the point of origin,
in which case an object could count as moving toward another object even if the
spatial location of the first object remains unchanged.

Let us demonstrate how the rule in (18) applies in the translation of sentence
(19) that is given in (20).

(19) Fangs slithered toward the rock.

(20) $\exists e^s, e^i\,[\,R(e^s, e^i) \;\&\; PAST(e^s) \;\&\; \exists x^s\,[\,R(x^s, f) \;\&\; \exists x^o\,[\,\forall z^o\,[\,rock'(z^o) \leftrightarrow x^o = z^o\,] \;\&\; \exists z^s\,[\,R(z^s, x^o) \;\&\; slither'(x^s)(e^s) \;\&\; \exists l\,[\,PATH(l, l_r, z^s) \;\&\; e^s \subseteq l \;\&\; l_r \subset e^s\,]\,]\,]\,]\,]$

The translation in (20) says that there is an event stage realizing some individual
event of Fangs' slithering such that that event stage lies in the past and the spatio-
temporal location of the event stage constitutes the initial part of a path between
some implicit point of reference $l_r$ and the location of some unique rock object. Since
$e^s$ in (20) is an initial part of a complete path to the rock, the truth of a sentence
such as (12) entails the truth of (19), but not vice versa. Moreover, (12), but not
(19), entails (15).
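
The entailment pattern just described can be checked on a toy model. The
following Python sketch is an illustration under strong simplifying
assumptions and does not reproduce the formal definition of PATH in Hinrichs
(1985): a location stage is reduced to a tuple of named positions in temporal
order, proper containment of the origin is approximated by requiring at least
one position beyond it, and the existential quantifier over paths ranges over
an explicitly supplied list. The function names and the place names are
hypothetical.

```python
def is_path(l, origin, dest):
    """PATH(l, origin, dest): l starts at the origin and ends at the destination."""
    return len(l) >= 2 and l[0] == origin and l[-1] == dest

def satisfies_to(e, origin, dest):
    """`to`: the event stage itself is a complete path."""
    return is_path(e, origin, dest)

def satisfies_toward(e, origin, dest, candidate_paths):
    """`toward`: e is an initial segment of some complete path,
    and e extends beyond the origin (proper containment)."""
    return any(
        is_path(l, origin, dest) and l[:len(e)] == e and len(e) >= 2
        for l in candidate_paths
    )

def ends_at(e, dest):
    """Counterpart of (15): the mover is at the destination when the event ends."""
    return e[-1] == dest

route = ("den", "grass", "rock")      # a complete path to the rock
partial = ("den", "grass")            # an initial segment of that path
stay = ("den",)                       # no movement away from the origin
paths = [route]

# (12) entails (19): a complete path is an initial segment of itself.
print(satisfies_to(route, "den", "rock"), satisfies_toward(route, "den", "rock", paths))
# (19) does not entail (12): the partial stage is toward, but not to.
print(satisfies_toward(partial, "den", "rock", paths), satisfies_to(partial, "den", "rock"))
# (12), but not (19), entails (15).
print(ends_at(route, "rock"), ends_at(partial, "rock"))
# Staying at the origin does not count as moving toward the rock.
print(satisfies_toward(stay, "den", "rock", paths))
```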

5.4  The  Aspectual  Effect  of  to  and  toward 

Apart from accounting for the relevant inference patterns between sentences
such as (12), (15) and (19), an adequate analysis of to and toward should also account
for a systematic difference in the aspectual behavior of these two directional modifiers.
Sentences such as (21b), which involve the preposition toward, describe atelic events or,
in the terminology of Vendler (1967), activities. Sentences such as (21a), on the other
hand, refer to telic events or to accomplishments in the Vendler classification.

(21) a. John walked to the library.

     b. John walked toward the library.

These aspectual properties can be demonstrated by examining the cooccurrence
restrictions of the sentences in (21) with temporal modifiers such as in an hour as in
(22) and with for an hour as in (23).

(22) a. John walked to the library in an hour.

     b. * John walked toward the library in an hour.

(23) a. John walked to the library for an hour.

     b. John walked toward the library for an hour.

As first pointed out by Vendler, only telic events or accomplishments can occur with
temporal modifiers such as in an hour. Modifiers such as for an hour can occur with
both activities and accomplishments. However, when modified by temporal for, only
activities as in (23b) can be interpreted as describing a single event. If temporal for
occurs with sentences that describe accomplishments as in (23a), such sentences have
to be interpreted in some special fashion to make them semantically acceptable.
(23a), for example, can best be understood as referring to an iterative event, namely
John's repeatedly walking to the library during the period of one hour.

Since doing something for x amount of time means doing something during most if
not all subintervals of the interval x, sentences such as (24), which refer to atelic
events or activities, can be characterized as being temporally homogeneous.

(24) Fangs slithered toward the rock.

To do something in x amount of time, on the other hand, means to do something at
some unique interval within x. Since telic events or accomplishments can be modified
by temporal in, they, in contrast to activities or atelic events, can be described as
being temporally heterogeneous: telic events such as (25) come about over the course
of some unique time interval I, i.e. not at some subinterval of I or at some interval
properly containing I.

(25) Fangs slithered to the rock.

If my analysis of directional toward and to is an adequate one, it should predict
that verb phrases formed with directional toward refer to temporally homogeneous
events, while verb phrases formed with to refer to temporally heterogeneous events.
Due to the way in which I have defined toward as an initial subpart of a path to the
projected point of destination, the reference property of temporal homogeneity
associated with toward can, in fact, be reconstructed in the following way. Let us
assume that there is a location $l_1$ which qualifies as an initial segment of a path from
a putative point of origin $r_1$ to a destination $d$. Moreover, let us assume that $r_2$, the
temporally final bound of $l_1$, is in turn the temporally initial bound for a location $l_2$
which forms the initial part of a path from $r_2$ to $d$. Then it follows that $l_1 \cup l_2$ is an
initial segment of a path from $r_1$ to $d$. This is precisely what is required to make the
semantics of toward homogeneous.




Since our account of motion verbs and directional toward does predict that
sentences such as (24) correspond to atelic and semantically homogeneous events, our
analysis can support inferences from sentences such as (26) to sentences such as
(27).

(26) United Flight 342 has moved toward Logan Airport for the last fifteen
     minutes.

(27) United Flight 342 moved toward Logan Airport ten minutes ago.

Inference patterns between sentences such as (26) and (27) are, in fact, highly
relevant for database interface systems that process spatial information. Imagine
that sentence (26) is presented to a database that monitors plane movements. If the
system does not have the capability to infer that the event described in (26) is true
at any subinterval of the fifteen minutes mentioned in (26), the United flight in
question would erroneously not be counted when the answer to a subsequent query
such as (28) is computed.

(28) How many planes moved toward Logan Airport ten minutes ago?
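
The database scenario can be sketched in a few lines of Python. This is a
hypothetical illustration, not a description of any existing interface
system: the record layout, the function names, the second flight "DL 10" and
the integer encoding of time (minutes relative to now) are all assumptions.
The sketch also simplifies "most if not all subintervals" to all
subintervals. A homogeneous report answers queries about any subinterval of
its stated interval, while a heterogeneous one answers only for the interval
as a whole.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Report:
    flight: str
    predicate: str
    start: int          # minutes relative to now (negative values lie in the past)
    end: int
    homogeneous: bool   # True for "toward" phrases, False for "to" phrases

def holds_over(report, q_start, q_end):
    """Does the reported event hold over the queried interval?"""
    if report.homogeneous:
        # Atelic event: true of every subinterval of the reported interval.
        return report.start <= q_start <= q_end <= report.end
    # Telic event: true only of the unique interval it was reported for.
    return (q_start, q_end) == (report.start, report.end)

def count_flights(reports, predicate, q_start, q_end):
    """Number of distinct flights for which `predicate` holds over the interval."""
    return len({
        r.flight
        for r in reports
        if r.predicate == predicate and holds_over(r, q_start, q_end)
    })

reports = [
    # (26): United Flight 342 has moved toward Logan for the last fifteen minutes.
    Report("UA 342", "move toward Logan", -15, 0, homogeneous=True),
    # A hypothetical telic report for contrast.
    Report("DL 10", "move to Logan", -15, 0, homogeneous=False),
]

# (28): how many planes moved toward Logan Airport ten minutes ago?
print(count_flights(reports, "move toward Logan", -10, -10))
# The telic report does not license the corresponding subinterval answer.
print(count_flights(reports, "move to Logan", -10, -10))
```

The first query counts the United flight because the report in (26) is marked
homogeneous; without that marking the flight would be missed, which is the
error described above.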

If we compare the semantics of toward, as we have defined it in (18) above, with
the semantics of to as defined in (11), it turns out that to is heterogeneous in its
reference in the same way as accomplishments. Recall that the semantics of to is
defined in terms of a complete path between a point of origin and a point of
destination. Since for any given path there do not exist any sublocations within that
path that themselves would qualify as a path between the same two locations, the
heterogeneous reference property of to follows automatically.


5.5  Conclusion 

In order to make an even stronger case in favor of my analysis of directional
modifiers, I would have to demonstrate how it can be generalized to locative
prepositions other than to and toward. Even though we cannot discuss this issue in
detail in the present paper, I should like to point out in conclusion that the notion of
a PATH plays an important role in the treatment of other directional prepositions such
as between, along and across. In the case of across the path seems to be bounded by
two locations on the periphery of the referent of the NP across is combined with, i.e.
across the meadow specifies some path extending from one end of the meadow to the
other. Notice, however, that the two locations that mark the two endpoints of such a
path cannot be chosen arbitrarily but in some sense have to be "opposite each other".
Undoubtedly, various pragmatic considerations enter the picture if one wants to make
this requirement of oppositeness formally more precise. Thus, it appears that the
notion of a path has to be complemented by additional constraints if one wants to
account for semantically more complex prepositions such as across. Even though we
will have to leave the formulation of such additional constraints to future research, it
should be obvious from these brief remarks that the notion of a path is a central
notion for the semantics of directional modifiers in general.




REFERENCES 


Carlson, Gregory N. (1977). Reference to Kinds in English. University of
Massachusetts dissertation.

Celce-Murcia, M. (1972). Paradigms for Sentence Recognition. Technical Report
HRT-15092/7907, System Development Corporation, Santa Monica, CA.

Cresswell, Maxwell (1978). 'Prepositions and Points of View'. Linguistics and
Philosophy, Vol. 2.1, pp. 1-41.

Davidson, Donald (1967). 'The Logical Form of Action Sentences'. In Rescher,
Nicholas, ed. The Logic of Decision and Action. Pittsburgh: University of
Pittsburgh Press, pp. 81-95.

Gunji, Takao and Norm Sondheimer (1980). 'The Mutual Relevance of Model-theoretic
Semantics and Artificial Intelligence'. SMIL: The Journal of Linguistic
Calculus, Vol. 3, pp. 5-42.

Hendrix, Gary, Craig Thompson, and Jonathan Slocum (1973). 'Language Processing via
Canonical Verbs and Semantic Models'. Proceedings of IJCAI-73.

Hinrichs, Erhard (1985). A Compositional Semantics for Aktionsarten and NP Reference
in English. Ph.D. dissertation, Ohio State University.

Montague, Richard (1970). 'Universal Grammar'. Theoria 36, pp. 373-398.

Montague, Richard (1973). 'The Proper Treatment of Quantification in Ordinary
English'. In Hintikka, J., J. Moravcsik, and P. Suppes, eds. Approaches to Natural
Language. Reidel, Dordrecht.

Reichenbach, Hans (1947). Elements of Symbolic Logic. Berkeley: University of
California Press.

Schank, Roger (1975). Conceptual Information Processing. North-Holland, New York.

Vendler, Zeno (1967). Linguistics in Philosophy. Ithaca: Cornell University Press.




6.  REFERENCE  AND  REFERENCE  FAILURES 


Bradley  A.  Goodman 
Abstract 

The goal of this work is the enrichment of human-machine interactions in a
natural language environment. Because a speaker and listener cannot be
assured to have the same beliefs, contexts, perceptions, backgrounds, or
goals at each point in a conversation, difficulties and mistakes arise when a
listener interprets a speaker's utterance. These mistakes can lead to various
kinds of misunderstandings between speaker and listener, including reference
failures or failure to understand the speaker's intention. We call these
misunderstandings miscommunication. Such mistakes can slow, and possibly
break down, communication. Our goal is to recognize and isolate such
miscommunications and circumvent them. This paper highlights a particular
class of miscommunication - reference problems - by describing a case study
and techniques for avoiding failures of reference. We want to illustrate a
framework less restrictive than earlier ones by allowing a speaker leeway in
forming an utterance about a task and in determining the conversational
vehicle to deliver it. This paper also promotes a new view for extensional
reference.


6.1  Introduction 

Reference in the real world differs greatly from the reference processes modelled
in current natural language systems. A speaker in the real world is a rational agent
who must make a decision about his description in a limited time, with limited
resources, knowledge, and abilities. In particular, the speaker's perceptual and
communicative skills are imperfect or his model of the listener is erroneous or
incomplete. Additionally, a speaker can also be sloppy in his description. Since the
speaker's goal in the reference process is to construct a description that "works" for
the listener, the listener, from his viewpoint, must take these imperfections into
account when trying to interpret the speaker's utterances. Yet listeners, too, have
imperfect perceptual or communicative skills and can be sloppy. Hence, they must be
prepared to deal with their own imperfections when performing reference
identification. In real reference, listeners often recover from initial
misunderstandings with or without help from the speaker. Natural language
understanding systems must do this, too. Therefore, in performing the reference
process, a system should assume and expect problems.



The focus of my work in [3, 4, 5] was to study how one could build robust
natural language processing systems that can detect and recover from
miscommunication. I investigated how people communicate and how they recover from
problems in communication. That investigation centered on reference problems,
problems a listener has determining whom or what a speaker is talking about. A
collection of protocols of a speaker explaining to a listener how to assemble a toy
water pump was studied and the common errors in speakers' descriptions were
categorized. The study led to the development of techniques for avoiding failures of
reference that were employed in the reference identification component of a natural
language understanding program.


The traditional approaches to reference identification in natural language
systems were found to be less flexible than people's real behavior. In particular,
listeners often find the correct referent even when the speaker's description does not
describe any object in the world. To model a listener's behavior, a new component
was added to the traditional reference identification mechanism to resolve difficulties
in a speaker's description. This new component uses knowledge about linguistic and
physical context in a negotiation process that determines the most likely places for
error in the speaker's utterance. The actual repair of the speaker's description is
achieved by using the knowledge sources to guide relaxation techniques that delete or
replace portions of the description. The algorithm developed more closely
approximates people's behavior than reference algorithms designed in the past. The
next section describes in more detail my work on reference.


6.2  Reference 




Communication involves a series of utterances from a speaker to a hearer. The
hearer uses these utterances to access his own knowledge and the world around him.
Some of these utterances are noun phrases that refer to objects, places, ideas and
people that exist in the real world or in some imaginary world. They cannot be
considered in isolation. For example, consider the utterance "Give me that thing." It
can be uttered in many different situations and can result in different referents of
"that thing." Understanding such referring expressions requires the hearer to take
into account the speaker's intention, the speaker's overall goal, the beliefs of the
speaker and hearer, the linguistic context, the physical context, and the syntax and
semantics of the current utterance. The hearer could misinterpret the speaker's
information in any one of these parts of communication. Such misunderstandings
constitute miscommunication. In my research I focused primarily on effects of the
linguistic context and the physical context.

To explore such reference problems, the following method was devised and
followed. First, protocols of subjects communicating about a task were analyzed.
Knowledge that people used to recover from reference miscommunications - knowledge
about the world and about language - was then isolated. Algorithms were designed to
apply a person's knowledge about linguistic and physical context to determine the
most likely places for error in the speaker's utterance. Then, computer programs
were written: (1) to represent a spatially complex physical world, (2) to manipulate
the structure of that representation to reflect the changes caused by the listener's
interpretation of the speaker's utterances and by physical actions to the world, (3) to
perform referent identification on noun phrases, and, when referent identification
failed, (4) to search the physical world for reasonable candidates for the referent.
These programs form one component of a natural language system.

One goal in this summary of my research is to illustrate how my views on
reference identification departed from views held by other researchers in artificial
intelligence. Another goal is to show where my research fits in the scheme of natural
language understanding by computers. My last goal is to summarize the approach of
my research.

6.3  A  new  reference  paradigm  from  a  computational  viewpoint 




Reference identification is a search process where a listener looks for something
in the world that satisfies a speaker's uttered description. A computational scheme
for performing such reference identifications has evolved from work by other artificial
intelligence researchers (e.g., see [6]). That traditional approach succeeds if a
referent is found, or fails if no referent is found (see Figure 6-1(a)). However, a
reference identification component must be more versatile than those previously
constructed. The excerpts provided in [3] show that the traditional approach is
inadequate because people's real behavior is much more elaborate. In particular,
listeners often find the correct referent even when the speaker's description does not
describe any object in the world. For example, a speaker could describe a turquoise
block as the "blue block." Most listeners would go ahead and assume that the
turquoise block was the one the speaker meant since turquoise and blue are similar
colors.


A key feature of reference identification is "negotiation." Negotiation in
reference identification comes in two forms. First, it can occur between the listener
and the speaker. The listener can step back, expand greatly on the speaker's
description of a plausible referent, and ask for confirmation that he has indeed found
the correct referent. For example, a listener could initiate negotiation with "I'm
confused. Are you talking about the thing that is kind of flared at the top? Couple
inches long. It's kind of blue." Second, negotiation can be with oneself. This
self-negotiation is the one that I was most concerned with in this research. The listener
considers aspects of the speaker's description, the context of the communication, the
listener's own abilities, and other relevant sources of knowledge. He then applies that
deliberation to determine whether one referent candidate is better than another or, if
no candidate is found, what are the most likely places for error or confusion. Such
negotiation can result in the listener testing whether or not a particular referent
works. For example, linguistic descriptions can influence a listener's perception of
the world. The listener must ask himself whether he can perceive one of the objects
in the world the way the speaker described it. In some cases, the listener's
perception may overrule parts of the description because the listener can't perceive
it the way the speaker described it.




To repair the traditional approach I developed an algorithm that captures for
certain cases the listener's ability to negotiate with himself for a referent. It can
search for a referent and, if it doesn't find one, it can try to find possible referent
candidates that might work, and then loosen the speaker's description using
knowledge about the speaker, the conversation, and the listener himself. Thus, the
reference process becomes multi-step and resumable. This computational model,
which I call "FWIM" for "Find What I Mean", is more faithful to the data than the
traditional model (see Figure 6-1(b)).
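
The contrast between the succeed/fail scheme and a multi-step, resumable one can be illustrated with a minimal Python sketch. It is not the FWIM implementation: the toy world, the feature names, and the fixed relaxation order are hypothetical, and only deletion of features is shown (replacement by a similar value is omitted).

```python
from typing import Dict, List, Optional, Tuple

Features = Dict[str, str]

# Hypothetical toy world: object name -> feature/value pairs.
WORLD: Dict[str, Features] = {
    "block1": {"color": "turquoise", "shape": "cube"},
    "cyl1": {"color": "red", "shape": "cylinder"},
}

def matches(description: Features, obj: Features) -> bool:
    """True if every described feature has exactly that value on the object."""
    return all(obj.get(f) == v for f, v in description.items())

def traditional_identify(description: Features,
                         world: Dict[str, Features]) -> Optional[str]:
    """Succeed/fail search: return the unique literal match, else None."""
    hits = [name for name, obj in world.items() if matches(description, obj)]
    return hits[0] if len(hits) == 1 else None

def relaxing_identify(description: Features,
                      world: Dict[str, Features],
                      order: List[str]) -> Tuple[Optional[str], Features]:
    """Try the literal description first; on failure, delete features one at
    a time in the given (assumed) order and retry after each deletion."""
    relaxed = dict(description)
    referent = traditional_identify(relaxed, world)
    for feature in order:
        if referent is not None:
            break
        relaxed.pop(feature, None)
        referent = traditional_identify(relaxed, world)
    return referent, relaxed

# "the blue cube": no object is literally blue, so the literal search fails.
description = {"color": "blue", "shape": "cube"}
literal = traditional_identify(description, WORLD)
referent, used = relaxing_identify(description, WORLD, ["color", "shape"])
```

Here the literal search returns nothing, while the relaxing search drops the color feature and settles on the turquoise cube. In the model described in this chapter the order of relaxation is not fixed in advance but is derived from several knowledge sources.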



Figure 6-1: Approaches to reference identification

One means of making sense of a failed description is to delete or replace the
portions that cause it not to match objects in the hearer's world. In my program I
am using "relaxation" techniques to capture this behavior. My reference identification
module treats descriptions as approximate. It relaxes a description in order to find a
referent when the literal content of the description fails to provide the needed
information. Relaxation, however, is not performed blindly on the description. I try
to model a person's behavior by drawing on sources of knowledge used by people. I
have developed a computational model that can relax aspects of a description using
many of these sources of knowledge. Relaxation then becomes a form of
communication repair (in the style of the work on repair theory found in [1]). A goal
in my model is to use the knowledge sources to reduce the number of referent
candidates that must be considered while making sure that a particular relaxation
makes sense. A brief description of it follows.

The component works by first selecting with a partial matcher a set of
reasonable referent candidates for the speaker's description (see also [7]). The
candidates are selected by searching the knowledge base, scoring partial matches of
each candidate to the speaker's description, and selecting those with higher scores.
The component then generates, using information from the knowledge sources, a
relaxation ordering graph that describes the order to relax features in the speaker's
description. Finally, it combines the candidates with the ordering to yield the most
likely referent. An ordered relaxation of parts of the speaker's description can be
provided by consulting knowledge known about linguistics (the actual form of the
speaker's utterance), perception (physical aspects of the world and the listener's
ability to distinguish different feature values in that world), specificity (hierarchical
knowledge to judge how vague or specific a particular feature value is), and others.
In other words, the algorithm attempts to show how a listener might judge the
importance of the features specified in a speaker's description using knowledge about
linguistic and physical context.

Figure 6-2 illustrates this process. The speaker's
description is represented at the top of the figure. The set of specified features and
their assigned feature value (e.g., the pair Color: Maroon) are also shown there. A set
of objects in the real world are selected by the partial matcher as potential
candidates for the referent. These candidates are shown near the top of the figure
(C1, C2, ..., Cn). Inside each box is a set of features and feature values that describe
that object. A set of partial orderings are generated that suggest which features in
the speaker's description should be relaxed first - one ordering for each knowledge
source (shown as "Linguistic," "Perceptual," and "Hierarchical" in the figure). For
example, linguistic knowledge recommends relaxing Color or Shape before Function,
and relaxing Function before Size. A control structure was designed that takes the
speaker's description, puts all the (partial) orders together, and then attempts to
satisfy them as best it can. This is illustrated at the bottom of the diagram by the
reordered referent candidates.


Figure 6-2: Reordering referent candidates
(panels: Speaker's Description, Represented Description, Candidate Objects,
relaxation orderings by knowledge source, Reordered Candidate Objects)
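
The three steps described above (partial matching, merging the per-source orderings, and reordering the candidates) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the control structure used in the program. The linguistic ordering is the one given in the text; the perceptual and hierarchical orderings, the candidate objects, the scoring function, the threshold, and the cost function are all hypothetical. The sketch also assumes the orderings never contradict one another, and it needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Speaker's description and a hypothetical set of world objects.
DESCRIPTION = {"color": "maroon", "shape": "round",
               "function": "container", "size": "large"}
WORLD = {
    "c2": {"color": "maroon", "shape": "round", "function": "container", "size": "small"},
    "c1": {"color": "red", "shape": "round", "function": "container", "size": "large"},
    "c3": {"color": "blue", "shape": "square", "function": "plug", "size": "small"},
}

# (relax_first, relax_later) pairs, one list per knowledge source.
ORDERINGS = {
    "linguistic": [("color", "function"), ("shape", "function"),
                   ("function", "size")],      # stated in the text
    "perceptual": [("color", "shape")],        # assumed for illustration
    "hierarchical": [("color", "size")],       # assumed for illustration
}

def partial_match_score(description, obj):
    """Fraction of the described features whose values the object shares."""
    hits = sum(1 for f, v in description.items() if obj.get(f) == v)
    return hits / len(description)

def select_candidates(description, world, threshold=0.5):
    """Keep the objects whose partial-match score reaches the threshold."""
    return [name for name, obj in world.items()
            if partial_match_score(description, obj) >= threshold]

def relaxation_order(orderings):
    """Merge all partial orderings into one order (assumes no cycles)."""
    sorter = TopologicalSorter()
    for pairs in orderings.values():
        for first, later in pairs:
            sorter.add(later, first)  # 'first' must come before 'later'
    return list(sorter.static_order())

def rank_candidates(description, world, candidates, order):
    """Prefer candidates whose mismatches fall on early-relaxed features."""
    position = {f: i for i, f in enumerate(order)}
    def cost(name):
        obj = world[name]
        return sum(position.get(f, len(order)) + 1
                   for f, v in description.items() if obj.get(f) != v)
    return sorted(candidates, key=cost)

order = relaxation_order(ORDERINGS)
candidates = select_candidates(DESCRIPTION, WORLD)
ranked = rank_candidates(DESCRIPTION, WORLD, candidates, order)
```

With these assumed orderings, color is relaxed before shape, function, and size, so the candidate that differs from the description only in color is ranked ahead of the one that differs only in size. A fuller treatment would also have to decide what to do when the knowledge sources disagree.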


6.4  Summary 

My goal in this work is to build robust natural language understanding systems,
allowing them to detect and avoid miscommunication. The goal is not to make a
perfect listener but a more tolerant one that could avoid many mistakes, though it
may still be wrong on occasion. In this summary of my research, I indicated that
problems can occur during communication. I showed that reference mistakes are one
kind of obstacle to robust communication. To tackle reference errors, I described how
to extend the succeed/fail paradigm followed by previous natural language
researchers.

I represented real world objects hierarchically in a knowledge base using a
representation language, NIKL, that follows in the tradition of semantic networks and
frames. In such a representation framework, the reference identification task looks
for a referent by comparing the representation of the speaker's input to elements in
the knowledge base by using a matching procedure. Failure to find a referent in
previous reference identification systems resulted in the unsuccessful termination of
the reference task. I claim that people behave better than this and explicitly
illustrated such cases in an expert-apprentice domain about toy water pumps [3].

I developed a theory of relaxation for recovering from reference failures that
provides a much better model for human performance. When people are asked to
identify objects, they appear to behave in a particular way: find candidates, adjust
as necessary, re-try, and, if necessary, give up and ask for help. I claim that
relaxation is an integral part of this process and that the particular parameters of
relaxation differ from task to task and person to person. My work models the
relaxation process and provides a computational model for experimenting with the
different parameters. The theory incorporates the same language and physical
knowledge that people use in performing reference identification to guide the
relaxation process. This knowledge is represented as a set of rules and as data in a
hierarchical knowledge base. Rule-based relaxation provided a methodical way to use
knowledge about language and the world to find a referent. The hierarchical
representation made it possible to tackle issues of imprecision and over-specification
in a speaker's description. It allows one to check the position of a description in the
hierarchy and to use that position to judge imprecision and over-specification and to
suggest possible repairs to the description.
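
The idea of using a value's position in a hierarchy to judge specificity and to suggest a repair can be illustrated with a small Python sketch. It does not use NIKL or its classifier; the child-to-parent table below is a hypothetical fragment of a color hierarchy, and the function names are invented for this illustration.

```python
# Hypothetical fragment of a feature-value hierarchy: child -> parent.
PARENT = {
    "turquoise": "blue",
    "navy": "blue",
    "maroon": "red",
    "blue": "color",
    "red": "color",
}

def ancestors(value):
    """Increasingly general values above `value`, nearest first."""
    chain = []
    while value in PARENT:
        value = PARENT[value]
        chain.append(value)
    return chain

def depth(value):
    """Position in the hierarchy: a larger depth means a more specific value."""
    return len(ancestors(value))

def compatible(described, actual):
    """A described value fits if it equals or generalizes the object's value."""
    return described == actual or described in ancestors(actual)

def generalize(described):
    """Suggest a repair for an over-specific value: move one level up."""
    return PARENT.get(described)
```

Under these assumptions `depth("turquoise")` is 2 and `depth("blue")` is 1, so "blue" is the vaguer description; `compatible("blue", "turquoise")` holds, while a description such as "maroon" that fails against a turquoise object can be repaired by generalizing it to "red" and, if necessary, further up the hierarchy.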

Interestingly, one would expect that "closest" match would suffice to solve the
problem of finding a referent. I showed, however, that it doesn't usually provide you
with the correct referent. Closest match isn't sufficient because there are many
features associated with an object and, thus, determining which of those features to
keep and which to drop is a difficult problem due to the combinatorics and the effects
of context. The relaxation method described circumvents the problem by using the
knowledge that people have about language and the physical world to prune down the
search space.
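
To give a rough sense of the combinatorics involved, the following Python sketch enumerates every non-empty set of features that an uninformed matcher could choose to drop from a description. The four feature names are taken from the example in Figure 6-2; the enumeration itself is only an illustration and is not part of the program described here.

```python
from itertools import combinations

def possible_deletions(features):
    """Yield every non-empty subset of features that could be dropped."""
    for k in range(1, len(features) + 1):
        yield from combinations(features, k)

features = ["color", "shape", "function", "size"]
options = list(possible_deletions(features))
count = len(options)  # 2**4 - 1 = 15 ways to drop features
```

Four features already allow 15 different deletions, and the number doubles (plus one) with each additional feature, before replacement of values is even considered. An ordering over features reduces this to a single sequence of relaxations to try.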



6.5  Future  directions 

The FWIM reference identification system I developed models the reference
process by the classification operation of NIKL. I need a more complicated model for
reference. That model might need a complete identification plan that requires making
inferences beyond those provided by classification. The model could also require the
execution of a physical action by the listener before determining the proper referent.
Cohen gives two excellent examples of such reference plans (p. 101, [2]). The first,
"the magnetic screwdriver, please," requires the listener to place various screwdrivers
against metal to determine which is magnetic. The second, "the three two-inch long
salted green noodles," requires the listener to count, examine, measure and taste to
discover the proper referent.




ACKNOWLEDGEMENTS 


This  research  was  supported  in  part  by  the  Center  for  the  Study  of  Reading 
under  Contract  No.  400-81-0030  of  the  National  Institute  of  Education  and  by  the 
Advanced  Research  Projects  Agency  of  the  Department  of  Defense  under  Contract  No. 
N00014-85-C-0079. 

I want to thank especially Candy Sidner for her insightful comments and
suggestions during the course of this work. I'd also like to acknowledge the helpful
comments of Marie Macaisa and Marc Vilain on this paper. Special thanks also to Phil
Cohen, Scott Fertig and Kathy Starr for providing me with their water pump dialogues
and for their invaluable observations on them.




REFERENCES 


1. Brown, John Seely and VanLehn, Kurt. "Repair Theory: A Generative Theory of
Bugs in Procedural Skills". Cognitive Science 4, 4 (1980), 379-426.

2. Cohen, Philip R. "The Pragmatics of Referring and the Modality of Communication".
Computational Linguistics 10, 2 (April-June 1984), 97-146.

3. Goodman, Bradley A. Communication and Miscommunication. Ph.D. Th., University
of Illinois, Urbana, Il., 1984. Also Report No. 5681, BBN Laboratories Inc., Cambridge,
Ma.

4. Goodman, Bradley A. Repairing Reference Identification Failures by Relaxation.
Proceedings of the 23rd Annual Meeting of the Association for Computational
Linguistics, Chicago, Illinois, July, 1985, pp. 204-217.

5. Goodman, Bradley A. "Reference Identification and Reference Identification
Failures". Computational Linguistics 12, 4 (October-December 1986), 273-305.

6. Grosz, Barbara J. The Representation and Use of Focus in Dialogue
Understanding. Ph.D. Th., University of California, Berkeley, Ca., 1977. Also Technical
Note 151, Stanford Research Institute, Menlo Park, Ca.

7. Joshi, Aravind K. A Note on Partial Match of Descriptions: Can One Simultaneously
Question (Retrieve) and Inform (Update)? Theoretical Issues in Natural Language
Processing-2, Urbana, Ill., July, 1978, pp. 184-186.




7. POINTING THE WAY: A UNIFIED TREATMENT OF REFERENTIAL GESTURE IN INTERACTIVE
DISCOURSE


Erhard Hinrichs and Livia Polanyi


Abstract 

In  this  paper,  we  argue  that  a  complete  model  of  interactively  constructed 
natural  discourse  must  provide  a  principled  account  of  deictic  gesture  which 
establishes  reference  to  non-linguistic  objects,  properties  and  relations. 
More  specifically,  we  shall  demonstrate  that  in  order  to  account  for  the 
contextual  relevance  of  linguistic  units  such  as  words,  phrases  and 
sentences,  an  adequate  discourse  model  must  include  (1)  a  compositional 
syntax  and  semantics  at  the  sentence  level  which  is  capable  of  dealing  with 
fragmentary  linguistic  input  and  (2)  a  discourse  component  which  accepts 
deictic  gestures  along  with  traditional  linguistic  units  as  input  and  assigns 
the  correct  context  of  interpretation  to  each  structure  parsed. 


7.1  Introduction 

The necessity of including an analysis of non-verbally encoded information
became clear to us when we started analyzing a corpus of Spatial Planning Protocols
collected for the purpose of analyzing how actual speakers interactively construct
plans. We soon discovered that failure to take non-verbal information into account
resulted in misunderstanding the nature of the communication being analyzed.
Repeatedly, we found that we did not understand what was going on in our data from
examining audio material alone. We needed the videotapes of the protocol collection
sessions to provide us with vital information about what was, in fact, happening in the
interaction between our research subjects.


The protocol collection sessions involved playing a game called "Travelling
through Europe". Two subjects playing together against a researcher were given a set
of nine European cities and a game board which consists of a map of Europe marked
with over one hundred city names joined together by lines representing legal routes.
The task of the subjects was to plan the most efficient route - one which would
allow them to visit all nine cities on their itinerary in the smallest number of steps.
"Playing the game" involved planning an itinerary and then taking turns throwing a
die and moving a marker on the board the number of city steps corresponding to the
number shown on the die. Updating and changing plans was allowed at any time.

7.2 Semantic Interpretation of Gesturally Suppleted, Verbally Incomplete
Propositions

Consider the piece of discourse in (1), an example taken from the corpus of
spatial planning protocols. Without the accompanying pointing gestures made by B,
which in (1) are set off in bold-face and by curly brackets, we might well characterize
B's functioning in this piece of discourse as inarticulate and indecisive.

(1)  A:  we have two points left
     B:  OKAY
         So (we can go to
     A:     (We might as well use them
         to go.      {B's finger at Genoa}
                     {B's finger moves from
                      piece at Genoa to Zurich.}
     B:  We could go to ...
         h. m =      {hand off Zurich}

A reasonably correct analysis of this data is only possible when the non-verbal
information is taken into account. When B's gestures are considered part of the
signifying mechanism he is employing, it becomes clear that B, far from producing
"incomplete" proposition-carrying units and adding little to the planning process, is
actively suggesting a very definite course of action. He is proposing that the players
should choose a route which takes them from Genoa to Zurich.




B's  Gesture 


On  the  observational  level  the  example  demonstrates  the  importance  of 
integrating  linguistic  and  non-linguistic  information  for  the  process  of  (computational) 
discourse  understanding  since  without  the  gestural  information  the  discourse  is 
semantically  incomplete.  For  a  theory  of  discourse  understanding  the  example  raises 
at  least  the  following  issues. 

1. An adequate discourse parser has to be able to accept as input elliptical
sentences such as We could go to in our example and augment the semantic
interpretation of such sentence fragments with the semantic interpretation of
referential gestures which supplete the verbal part of such utterances.

2.  Once  an  elliptical  sentence  fragment  in  combination  with  its  gestural 
suppletive  has  been  correctly  parsed,  interpretive  structures  are  needed  to  determine 
what  function  such  a  unit  of  discourse  plays  for  the  discourse  as  a  whole. 

We will argue in this paper that a discourse model which is based on the
Linguistic Discourse Model developed in Polanyi and Scha (1984) and Polanyi (1985) can
provide a principled account of how to provide a compositional syntax and semantics
for individual sentences of a discourse (including elliptical sentences) and how to
assign meaning to individual sentences in the context of a discourse as a whole.


7.3 A Compositional Syntax and Semantics for Sentences in Discourse

Let us first consider that component of our discourse model that deals with the
interpretation of individual sentences, in particular sentence fragments. We agree
with a growing number of researchers both in the field of theoretical linguistics and
in the field of artificial intelligence that the syntax/semantics interface of a grammar
should account for the data in terms of a grammar formalism that is as restrictive as
possible in its generative power and hence in its computational complexity. At the
same time we take it to be the fundamental task of a semantic theory of natural
language to relate in a systematic fashion linguistic expressions to real-world objects,
relations and states-of-affairs. In particular, we follow Richard Montague (cf.
Montague 1970, 1973) and others in assuming that the meaning of sentences should be
given at least partly in terms of the conditions that would make them count as true in
a given world or in some state of affairs. Moreover, we assume with Montague and
others that the semantic composition of any linguistic expression is to be conceived of
as a homomorphic image of its syntactic composition. That is, we assume Montague's
interpretation of Frege's Principle, which says that the meaning of a syntactically
complex expression has to be derived in terms of the meanings of its syntactic parts.

For the syntactic analysis of individual sentences we adopt for our discourse
model the version of categorial grammar developed by Steedman (1985) and by
Steedman and Ades (1983). Our decision to adopt their syntactic framework is
motivated by the following considerations. Categorial grammars are restrictive in their
generative capacity. As shown by Friedman et al. (1985), depending on the type of
syntactic functors permitted, categorial grammars of the type Steedman (1985)
discusses generate at best only context-free languages and at worst only the set of
context-sensitive languages. Moreover, as Steedman himself points out, his categorial
grammars provide a natural mechanism for left-to-right parsing of an input stream,
which is particularly attractive for computational purposes. Finally, and most
relevant for the purposes of this paper, Steedman's version of categorial grammar
offers a principled treatment of syntactically incomplete utterances such as the elliptical
We could go to in the discourse example that we are analyzing in this paper.

Relying crucially on Steedman's notion of functional composition, we assign the
parse in (2) to the elliptical utterance We could go to, which in our example is
suppleted by a referential gesture.

(2)   We       could      go     to
      S/FVP    FVP/VP     VP     VP/VP, /VP
                          ----------------- (FPC)
                               VP/NP
               ---------------------------- (FPC)
      ------------------------------------- (FPC)


Apart from the usual operation of functional application or Forward Combination (FC)
in a categorial grammar, which can be stated as in (3), Steedman and Ades allow for
operations of partial combination or functional composition. Among such partial
combination operators, they define Forward Partial Combination (FPC) as in (4).


(3)  X/Y  Y    =>  X      (FC)

(4)  X/Y  Y/Z  =>  X/Z    (FPC)


It is the operation of Forward Partial Combination that allows us first to combine
the verb go in (2) with its prepositional modifier to, even though the preposition is
lacking its NP argument, and then to combine the resulting phrase go to with the
remaining words in the sentence.
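
The two combination rules can be tried out on the fragment with a few lines of code. The following is only a minimal sketch, not the authors' parser: categories are modelled as nested pairs, and the lexical categories chosen for go (VP/PP) and to (PP/NP) are assumptions made for illustration, as are all function and variable names.

```python
from typing import Optional, Tuple, Union

# A category is an atom such as "NP", or a pair (result, argument)
# standing for the forward functor result/argument.
Category = Union[str, Tuple["Category", "Category"]]


def fc(left: Category, right: Category) -> Optional[Category]:
    """Forward Combination (3): X/Y  Y  =>  X."""
    if isinstance(left, tuple) and left[1] == right:
        return left[0]
    return None


def fpc(left: Category, right: Category) -> Optional[Category]:
    """Forward Partial Combination (4): X/Y  Y/Z  =>  X/Z."""
    if isinstance(left, tuple) and isinstance(right, tuple) and left[1] == right[0]:
        return (left[0], right[1])
    return None


def show(cat: Category) -> str:
    """Render a category in slash notation (sufficient for the flat cases here)."""
    return cat if isinstance(cat, str) else f"{show(cat[0])}/{show(cat[1])}"


# Hypothetical lexicon: the categories for "go" and "to" are assumed.
we, could = ("S", "FVP"), ("FVP", "VP")
go, to = ("VP", "PP"), ("PP", "NP")

go_to = fpc(go, to)                # go + to            => VP/NP
could_go_to = fpc(could, go_to)    # could + go to      => FVP/NP
fragment = fpc(we, could_go_to)    # We + could go to   => S/NP

print(show(go_to), show(could_go_to), show(fragment))  # VP/NP FVP/NP S/NP
print(fc(fragment, "NP"))                              # S
```

Under these assumptions the whole fragment ends up with a category that still lacks an NP, which is the slot the referential gesture has to fill.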


^1 For the use of extended categorial grammar for parsing natural language see also
Wittenburg (1986).

Let us consider next how elliptical sentences as in (2) can be interpreted
together with their accompanying deictic gestures. For the semantic interpretation of
space and time, which plays a central role in a task domain such as spatial planning,
we adopt the approach to spatial and temporal information which I have developed in
my dissertation, Hinrichs (1985). In particular, we adopt from this earlier work the
semantics of motion verbs and directional modifiers which is needed for the
interpretation of (2). We treat motion verbs as stage-level predicates in the sense of
Carlson (1977), namely as predicates whose arguments refer to stages of individuals,
which are interpreted as the space-time locations occupied by the individuals.
Following Carlson, individuals are connected to their stages in terms of a realization
relation R, which associates a given individual with all of the (spatio-temporal)
locations at which that individual is present. Motion verbs such as go are interpreted
in terms of two-place predicates whose first argument position ranges over individual
stages which realize the referent of the subject NP combining with go. The rightmost
argument position ranges over event stages realizing the event that the referent of
the subject NP is engaged in. We thus adopt the strategy of Davidson (1967) to
reserve in the argument structure of action verbs an extra argument position which
refers to events, or more specifically event stages. Directional prepositions such as to
function semantically as modifiers which specify further the location at which the
event associated with an action verb takes place. In particular, the translation of to
into the interpretation language introduces the notion of a spatio-temporal path
between some implicit point of origin, denoted by the free variable l_r occurring in the
formula in (5), and some point of destination, which is represented by the referent of
the NP that to combines with. Given this analysis of go and its directional modifier to,
the elliptical sentence in (2) is translated into the formula in (5).^2

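(5)  $\lambda P\, P\,[\,\lambda y^{o}\, \Diamond\, \exists e', l_1, l_2, l_3\; [\,R(l_1, y^{o}) \mathbin{\&} R(l_2, \mathit{we}') \mathbin{\&} R(l_3, e') \mathbin{\&} \mathit{go}'(l_2, l_3) \mathbin{\&} \mathrm{PATH}(l_3, l_r, l_1)\,]\,]$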

The variable P in (5) is meant to range over properties of properties of individuals.
Thus, the formula denotes the set of such complex properties of some object denoted
by y^o whose space-time location l_1 forms the potential destination of some path l_3
which originates at some reference point l_r and which is traversed during an event of
some individuals denoted by we going from l_r to the location of y^o.


^2 The ◇ symbol in (5) is meant to represent the propositional operator of epistemic
possibility, as standardly used in modal logic. The meaning of this operator is most
closely represented in English by the modal can, rather than could. Hence, the translation
in (5) somewhat oversimplifies the counterfactual modal force of (2).




If (5) represents the translation of (2), the deictic gestures accompanying the
elliptic utterance can be interpreted in the following way. The formula in (5) is
deficient in two respects: the reference point l_r has to be specified, and the object
y^o has to be identified which has among its properties that it represents the end-point
of some path originating at l_r. The deictic gestures identify Genoa and Zurich as the
two objects representing the reference point and the point of destination respectively.
Since the property that "ties these two objects together" in (5) is the notion of a
path and since both objects represent space-time locations, the most salient
interpretation for the gesture is that it indicates a path between the two cities.
Combining the referents of the deictic gestures with the formula in (5), we arrive at
the translation in (6).

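(6)  $\Diamond\, \exists e', l_1, l_2, l_3\; [\,R(l_1, \mathit{Zurich}') \mathbin{\&} R(l_2, \mathit{we}') \mathbin{\&} R(l_3, e') \mathbin{\&} \mathit{go}'(l_2, l_3) \mathbin{\&} \mathrm{PATH} \ldots$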

In the context of the game, the potential path identified by the two players does, of
course, not stand for a real-world journey between two European cities, but
rather for a particular move in a game of route optimization. The conversational
function of the elliptical utterance can be explicated in terms of the Linguistic
Discourse Model (LDM) by providing the appropriate elements of discourse
grammar relative to which the utterance has to be interpreted.


7.4  The  Conversational  Function  of  the  Elliptical  Utterance 

To elucidate how interpretive contexts may be arrived at, we shall make use of
the Linguistic Discourse Model, a theory of discourse structure which provides the
theoretical notions and formal machinery necessary to assign B's gesturally suppleted
utterance to the relevant contexts. The LDM, a theory of the structural and semantic
relations obtaining among clauses in discourse, is formulated as a Discourse Parser.
The unit of input to the parser is normally a linguistically realized clause encoding a
single proposition, but may also be a non-verbally suppleted utterance such as (1) or
a non-verbal action sequence such as the pointing gesture in (14) below.

Before beginning the LDM analysis, let us first consider briefly some of the
factors which we intuitively take into account in interpreting B's proposition as a
proposal that a route from Genoa to Zurich be taken in the game which they are
playing:

o A and B are engaged in an interaction with each other.

o They constitute a team playing the "Game Travelling through Europe" as part
of an experiment.

o A and B play this game cooperatively. They agree together to moves which
are acceptable to both and which they believe to be permitted by the rules
of the game.

o It is A and B's turn in the game.

o After the die is thrown and it is clear how many points are available to
them, A and B have to agree upon a course of action.

o Agreeing upon a course of action involves a negotiation in which proposals
and counterproposals are made.

o Putting some course of action on the table is a possible first step in a
negotiation sequence.

o The presence of the modal might in the verbally encoded We might as well use
them to go to signals a proposal to perform the action specified in the
embedded phrase.

o The verbally encoded phrase is suppleted by B's use of his finger to connect
two dots on the gameboard construed in the game as representing "cities".

o The beginning point of B's tracing motion is at the dot marked "Genoa" and
the trace ends at the dot marked "Zurich". A&B's playing token is located
at "Genoa" as the turn begins. The number of steps to "Zurich" is two,
which is the number thrown on the die a moment earlier.

The LDM provides a formal mechanism for capturing these intuitive conceptions of what
is happening at the time of B's gesturally suppleted utterance.


7.5  The LDM


The LDM parser describes a discourse as built up by means of a sequencing and
recursive embedding of discourse constituents. The Parser proceeds on a clause-by-clause,
left-to-right basis through the discourse, assigning each constituent clause to
appropriate interpretive contexts and assigning to the discourse as a whole an
incrementally constructed structural description.



In real language use, utterances occur in real or modelled Interactions and are
used to carry out various interactive business, so-called Speech Events (Hymes 1972).
In the LDM, which models language structure at the discourse level, there are similarly
no uncontextualized clauses. Clauses in discourse are linked together in discourse
constituent units (dcu's) of one or more clauses and used to build up the genre units
- here called "Discourse Units" - which speakers use to develop conventionally
structured, semantically coherent accounts of the states of affairs in some Modelled
Discourse World.

Under an LDM analysis, the propositional information encoded in a given "clause"
receives its interpretation relative to a hierarchy of contexts deriving from units of
the several types (dcu, DU, Speech Event and Interaction). Speech Events are further
divided into "Moves". Each unit type is further subdivided into sub-types of various
sorts.^3 Each unit type has an associated grammar which specifies its legal
constituents and their permissible orders.^4 Grammars for the Turn and Negotiation
Sequence are discussed below. (See (8) and (9) below.)

In  assigning  a  structural  description  to  a  developing  discourse  on  a  clause  by 
clause  left  to  right  basis,  the  LDM  makes  use  of  these  grammars  together  with 
semantic  and  structural  information  about  the  propositional  content  and  contexts  of 
occurrence  of  the  given  clause.  The  required  information  about  the  clause's  content 
derives  ultimately  from  the  proposition  it  encodes,  while  contextual  information 
reflects  specific  higher  level  contextualizing  units  in  which  the  incoming  clause 
participates. 


^3 The "clause" is a one-element dcu. "Sequential" and "expansion" dcu's are the two most
important classes of complex dcu. The "story", "argument" and "proposal" are relatively
common DU's, while there are Speech Event units as various as conversations, consultations
with the village elder, bridge parties, and games of bridge.


^4 Often these grammars are simple phrase structure grammars, although more complex grammars
are also needed to capture the structure of some types of units.




7.6  Structural  Relations  in  Discourse 

Discourse constituents are related structurally to one another by coordination
and subordination. The LDM is fully recursive and constituents of all types are legally
embeddable within one another. Interruptions, elaborations, asides and parentheticals
are uniformly treated as discourse subordinations because they disturb the orderly
development of some ongoing discourse activity. Discourse Coordination is more
constrained and is effectively limited to Topic chains, narratives, and other list dcu
structures as well as sequences of moves in Discourse Units and Speech Events and
sequences of independent Interactions.


7.7  Discourse  Parsing  with  the  LDM 

In order to process a discourse, the discourse parser makes use of an inventory
of individual Type and sub-Type Grammars, calling upon parsers associated with them
as needed to process constituents of different sorts. There is no limit to the number
of times an individual parser might be used to process a given discourse, nor
is there any constraint placed on the order in which those parsers must be called.
The frequency of calls to any parser and the order of calls depend entirely on the
nature of the individual discourse.

The LDM Parser builds the Discourse History Parse Tree for a given discourse by
coordinating and subordinating incoming clauses into discourse constituent units at
existing or newly created rightmost nodes in the tree. Only rightmost nodes are
structurally accessible.
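
The accessibility restriction can be made concrete with a small tree walk. The sketch below is an illustration only: the `Node` class, the function name `open_nodes` and the node labels are hypothetical and are not taken from the LDM implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def open_nodes(root: Node) -> List[Node]:
    """Return the rightmost path from the root down to the last clause.

    These are the only nodes at which an incoming clause may attach.
    """
    path = [root]
    while path[-1].children:
        path.append(path[-1].children[-1])
    return path


# Hypothetical discourse tree: an earlier, completed unit and the current turn.
tree = Node("Discourse", [
    Node("Earlier unit", [Node("earlier clause")]),
    Node("Turn", [Node("Throw Die", [Node("We have two points left")])]),
])

print([node.label for node in open_nodes(tree)])
# ['Discourse', 'Turn', 'Throw Die', 'We have two points left']
```

The earlier unit and its clause do not appear in the result: once a new rightmost sister has been added, they are closed to further attachment.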

By comparing the content and discourse unit contexts of the incoming unit with
the information available at open nodes, the Parser determines whether (1) to
coordinate the target utterance to the immediately preceding constituent as a sister
node, (2) to add it to the tree as a rightmost sister to some higher level constituent,
or (3) to subordinate it to an existing accessible constituent available somewhere in
the tree.

The LDM framework thereby resolves an apparently insoluble problem in discourse
analysis. Anything can happen in any discourse, and therefore a theory of discourse
structure must account for the highly individual and possibly unexpected structure of
any given discourse while, simultaneously, accounting for the fact that, at any given
moment, speakers are normally quite clear about the kind of discourse activity
underway and have very definite expectations about what is likely to happen next.

These expectations about how the discourse will proceed, captured in the
grammars of the various unit types, are exploited in the LDM's assigning to B's
utterance the status of a PROPOSAL.


7.8  Analyzing  the  Discourse  Context  of  B’s  Utterance 




When B's utterance is encountered by the parser, the parser has just finished dealing
with the previous utterance and has assigned to We have two points left a set of
interpretive contexts reflecting its current state. These contexts, shown in (7), are
occasioned by the throw of the die during one of A&B's turns at Play in the
Speech Event_{Playing the game "Travelling Through Europe"}, itself part of a
Speech Event_{Experiment} taking place during a unique spatial/temporal/social
Interaction_{Kaplan Context}.^6


^5 Coordination to a previous unit located at a still accessible node in the Discourse Tree
is permitted only if the two units are constituents of the same set of higher level units
and then only if the Grammar of the lowest level common unit specifies that the incoming
unit can legally function as a next constituent. Subordination is the default structural
relation and obtains (1) between an incoming clause and a structurally accessible dcu if the
values associated with the propositional content of the clause bear an instantiation
relation to (at least a subset of) the semantic context values of the candidate mother node
or (2) between a clause and the last unit parsed if the semantic values associated with the
propositional value of the clause have no semantic relation with the semantic values at any
available node. (See section 10, below.) For the purposes of the present analysis, we will
be concerned only with discourse coordination.


(7)  <Interaction_{Kaplan Context}
       <Speech Event_{Experiment}
         <Speech Event_{Playing Travelling through Europe}
           <Move_{Take turns}
             <Sub-Move_{Turn A&B}
               <Sub-Move_{Throw die}
     >>>>>>


According to the Grammar of A&B's Turn in (8),

(8)  TURN GRAMMAR

     Move_{Turn}  -->  Throw Die + Negotiate Action + Move Counter

the parser now expects A and B to Negotiate a course of action to take in deciding
what "route" to use in accomplishing the part of their Game World Journey which
would advance them towards their goal. According to the grammar of negotiation
shown in (9), the first part of any Negotiation Sequence in this game is a Proposal for
what to do relative to the position of the players' piece on the map game board:

(9)  NEGOTIATION GRAMMAR

     Move_{Negotiate}  -->  Make Proposal + (Discussion of Proposal)
                            + [Counter Proposal
                               + (Discussion of Counter Proposal)]*
                            + One Proposal Accepted
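
Read together, grammars (8) and (9) behave like a small finite-state machine over move labels, which is one way to see where the parser's expectations come from. The following is a sketch under that reading, not the LDM's own machinery; the state names and function names are assumptions introduced for illustration.

```python
from typing import List, Optional

# Transition table for (8) with Negotiate Action expanded by (9).
# State names are hypothetical; move labels are those of the grammars.
TRANSITIONS = {
    "start": {"Throw Die": "thrown"},
    "thrown": {"Make Proposal": "proposed"},
    "proposed": {
        "Discussion of Proposal": "discussed",
        "Counter Proposal": "countered",
        "One Proposal Accepted": "accepted",
    },
    "discussed": {
        "Counter Proposal": "countered",
        "One Proposal Accepted": "accepted",
    },
    "countered": {
        "Discussion of Counter Proposal": "counter_discussed",
        "Counter Proposal": "countered",
        "One Proposal Accepted": "accepted",
    },
    "counter_discussed": {
        "Counter Proposal": "countered",
        "One Proposal Accepted": "accepted",
    },
    "accepted": {"Move Counter": "done"},
    "done": {},
}


def run(moves: List[str]) -> Optional[str]:
    """Return the state reached after `moves`, or None if a move is not licensed."""
    state = "start"
    for move in moves:
        state = TRANSITIONS[state].get(move)
        if state is None:
            return None
    return state


def expected_next(moves: List[str]) -> List[str]:
    """Constituents the grammars allow next, given the moves seen so far."""
    state = run(moves)
    return sorted(TRANSITIONS[state]) if state is not None else []


def is_complete_turn(moves: List[str]) -> bool:
    return run(moves) == "done"


print(expected_next(["Throw Die"]))  # ['Make Proposal']
```

After the die has been thrown, the only licensed continuation is a Proposal, which matches the expectation described for the parser at this point in the turn.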


^6 Kaplan Contexts (see Kaplan ms.) specify real world temporal, spatial, and participant
indices. "Kaplan Context" is being used to identify the unique utterance situation which
took place at a unique real world spatial Location (Conference Room, BBN Labs, Cambridge,
MA), at Temporal Index (November 4, 1986, 10:00 - 10:15 AM), involving
Participants B E H L P


Expecting a Route Proposal, the parser easily processes B's gesturally suppleted
utterance as such a Proposal since it conveys appropriate propositional information
and is encoded according to the syntactic conventions appropriate for signalling
possibility. The parser assigns the formula in (6) the interpretive contexts shown in
(10).


(10)  <Interaction_{Kaplan Context}
        <Speech Event_{Experiment}
          <Speech Event_{Playing Travelling through Europe}
            <Move_{Turns}
              <Sub-Move_{Turn A&B}
                <Discourse Unit_{Negotiation of route to take}
                  <dcu_{Proposal}
      >>>>>>>

These  contexts  localize  B’s  gesturally  suppleted  clause  as  a  unique  utterance  relative 
to  unique  circumstances  of  utterance  and  are  used  by  the  LDM  to  compute  how  the 
encoding  clause  participates  in  the  tree  structure  of  the  emerging  discourse. 


In order to assign We could go to, or any other incoming clause, a position in
the Discourse History Parse Tree, the LDM parser attempts to match the contexts of
the present utterance with those of the immediately preceding utterance We have two
points left (shown above in (7)).


These contexts are available in the tree of the developing discourse as the label
at the node immediately dominating the terminal clause node as shown in Figure 11.


(11)       <1<2<3<4<TURN>>>>>
                   |
       <1<2<3<4<5<THROW DICE>>>>>>
                   |
        "We have two points left"


In the present case, therefore, the first five contexts match:

o Interaction_{Kaplan Context}

o Speech Event_{Experiment}

o Speech Event_{Playing Travelling Through Europe}

o Move_{Complete Turns}

o Sub-Move_{Turn A&B}



However, when processing Context 6 of We could go to, the parser is unable to effect a
match: Context 6 of the preceding unit -- <Throw die> -- does not match Context 6
of the present clause, which is <Negotiation of Route>. At this point, with reference
to the state of the discourse as reflected in the parse tree, the parser must make use
of the grammars of the discourse units currently under construction and the context
information encoded at open nodes to decide whether to subordinate, coordinate, or
superordinate the incoming unit at the node corresponding to Context 5 in the tree.


The decision process in this case is not complicated. Because the higher level
interpretation contexts match and because "Negotiating a route to take" is an
appropriate next constituent to follow "Throw Die" according to the Grammar of A &
B's Turn (see (8) above), We could go to is coordinated with We have two points left
under a coordination node carrying the values of the five matched contexts as
illustrated in Figure 12.

(12)                 <1<2<3<4<TURN>>>>>
                      |               |
                      |      <1<2<3<4<PROPOSAL>>>>>
                      |               |
      "We have two points left"   "We might as well use them to go
                                   (FROM GENOA TO ZURICH)"
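
The matching step that leads to this coordination can be sketched as a comparison of two context lists followed by a look-up in the Turn grammar. This is a simplified illustration only: the context labels are abbreviated, the sixth context of the incoming clause is labelled "Negotiate Action" so that it lines up with grammar (8), and the function names are hypothetical. Anything that is not a licensed coordination falls back to subordination, which the analysis treats as the default relation.

```python
from typing import List, Tuple

# Grammar (8) as an ordered list of constituents.
TURN_GRAMMAR = ["Throw Die", "Negotiate Action", "Move Counter"]


def matched_prefix(previous: List[str], incoming: List[str]) -> int:
    """Number of leading contexts shared by two utterances."""
    count = 0
    for a, b in zip(previous, incoming):
        if a != b:
            break
        count += 1
    return count


def may_follow(grammar: List[str], earlier: str, later: str) -> bool:
    """True if `later` is the constituent right after `earlier` in `grammar`."""
    if earlier not in grammar or later not in grammar:
        return False
    return grammar.index(later) == grammar.index(earlier) + 1


def attach(previous: List[str], incoming: List[str],
           grammar: List[str]) -> Tuple[str, int]:
    """Return the structural relation and the number of matched contexts."""
    n = matched_prefix(previous, incoming)
    diverges = n < len(previous) and n < len(incoming)
    if diverges and may_follow(grammar, previous[n], incoming[n]):
        return "coordinate", n
    return "subordinate", n


shared = ["Interaction", "Speech Event: Experiment", "Speech Event: Playing",
          "Move: Turns", "Sub-Move: Turn A&B"]
previous = shared + ["Throw Die"]                     # "We have two points left"
incoming = shared + ["Negotiate Action", "Proposal"]  # "We could go to"

print(attach(previous, incoming, TURN_GRAMMAR))  # ('coordinate', 5)
```

The result says that the incoming clause is coordinated under a node carrying the five matched contexts, as in the figure above.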

Carrying the analysis one step further, we can now account for the relevance of
A's next remark, Why not just go to Lyons and we'd be on our way to Orleans? In the
context where it occurs, A's comment is commonplace and fully coherent. "Lyons" is
seen as a counterproposal to B's gesturally communicated proposal to follow a route
from Genoa to Zurich.


Viewed in terms of the LDM framework, A's behavior is predictable from the
Grammar of Negotiation. Following B's Proposal, A makes the next Move allowed by
that Grammar and utters a complex clausal construction which functions in the
ongoing context as a counter-proposal to B. Since contexts 1-6 of the two utterances
are the same, as is shown in Figure 13, the LDM, when processing Why not just go to
Lyons and we'd be on our way to Orleans?, will eventually coordinate it to We could go
to [Zurich from Genoa] under a node with values <1-6> on the Discourse History
Parse Tree.


(13)                        <1<2<3<4<TURN>>>>>
                             |              |
      <1<2<3<4<5<6<PROPOSAL>>>>>>>     <1<2<3<4<5<6<COUNTERPROPOSAL>>>>>>>
                   |                                   |
      "We might as well use them           "Why not just go to Lyons
       to go (FROM GENOA TO ZURICH)"        and we'd be on our way to Orleans"


10.  ESTABLISHING THE CONTEXTUALLY CONSTRAINED INTERPRETATION OF
B'S UTTERANCE

In order to establish the contextually constrained interpretation to be accorded
a given unit in a discourse, each unit under an LDM analysis is associated with a
semantic frame which gives relevant semantic information about the unit in terms of a
number of semantic indices which specify parameters such as temporal, spatial, goal
and participant information abstracted from the unit's content.

Formally defined operations on the semantic parameters of low level constituents
yield the parameters for higher level constituents which contextualize them. Lower
level constituents may only participate in high level constituents if the semantic
values of at least a subset of the lower level unit are related systematically to the
values of a higher level unit located at an open node on the tree (Polanyi 1985).

The semantic parameters of purely linguistic units (clauses, dcu's, and Discourse
Units) are set relative to the interpretation accorded a unit's propositional content.

For Interactions the indices in the semantic frame are real world Kaplan
Contexts as indicated below, while for Speech Events, the relevant semantic dimensions
concern Speech Event roles, activities, and concerns. Temporal and spatial aspects of
Speech Events are set relative to the activity at hand. Thus, at the Speech Event
level, the same physical location used for an "Experiment" is defined differently from
that same physical location used to hold an "Auction".

Therefore, the participant set (Player 1, Player 2, Player 3) of the
Speech Event_{Playing Travelling Through Europe} is related systematically to a subset
of the participants playing roles in the higher level Speech Event_{Experiment}
(Experimenter 1, Experimenter 2, Research Subject 1, Research Subject 2). The role
playing participants of the Experiment Speech Event are similarly related to a proper
subset of the participants of the Interaction_{Kaplan Context}. The individual
A in the real world of the Interaction is defined relative to his Speech Event_{Experiment}
role as Research Subject, and to his role in Speech Event_{Playing Travelling Through
Europe}.

Space and Time in the lower level units are likewise established with reference to
the spatial and temporal parameters of higher level contexts. The Spatial parameter
associated with A&B's Turn, for example, is set relative to the Spatial parameters
of the contextualizing higher level unit -- the Complete Turn. The Spatial parameters
of the Complete Turn include all possible routes for both teams while the spatial
parameters for Turn_{A&B} include only possible routes for A&B's Gameworld surrogate.
For the example in question, therefore, the possible interpretation of the spatial
locations referred to in We could go to "Zurich" from "Genoa" is restricted by
context computations to the Genoa and Zurich on the game board and cannot refer to
the "Genoa" and "Zurich" in the real world, on any other map or relevant to any
other world of discourse. We is similarly interpreted as We the surrogates associated
with We the Players associated with We A&B in the Real World in which the Interaction
took place.

7.9  A  Gestural  Proposal 

Consider next the example in (14), in which B's gestures do not supplete a
partially verbalized proposition but function independently to convey meaning.

(14)  A:  ... and then we've covered most of our ground
          right there
          ((B points at Berlin))
          We've still got Berlin ((B moves to Prague))
          to hit ((B's finger at Prague))

      B:  Prague

      A:  And some place called Lodz

If one has access only to the record of the spoken interaction, the talk seems
somewhat incoherent at that point.^7 As was the case with example (1), in order to
ascertain the coherence and relevance of the surrounding linguistic material correctly,
sentential syntactic and semantic analyses must again be augmented by an
understanding of the discourse context. When B's gestures are considered part of the
signifying mechanism he is employing, it becomes clear that, far from adding little to
the planning, B is actively suggesting a very definite course of action. By pointing to
Berlin and then to Prague, B makes nonverbal proposals -- suggesting that the
players should include Berlin and then Prague on their itinerary. A's uttering "Berlin"
and then "Prague" can now be understood as accepting B's proposal to include those
cities in that order in their itinerary.

The LDM analysis of B's pointing to Berlin on the game map in (14) is similar to the
analysis of example (1). The point is interpreted as a proposal to visit Berlin after
the cities A has specified. B's pointing gesture receives this interpretation because
the pointing takes place as the discourse parser is processing a discourse unit of
type PLAN (Linde and Goguen, 1978) and expects either (1) a reaction to A's proposal
-- either accepting, rejecting or adding to it -- or (2) an initiation or resumption of
an unrelated unit. In this structural position, deictic reference to a city known to be
on the itinerary is interpreted as adding to the proposal a suggestion to visit that
city next. B's point to Berlin thus functions as a proposal that Berlin be visited next.


^7 Lacking knowledge of B's non-verbal proposal, the researcher cannot help but wonder
where and how "Berlin" is encoded into A's cognitive map of Europe. A had been dealing
systematically with geographical areas quite far removed from the center of Europe and
"suddenly", with an intonational change, switched his attention to "Berlin". Without
understanding that A's comment is a response to B's action it is not at all clear why A
switches gears so suddenly or what principle of geographical organization he was using to
structure his itinerary plan.




7.10  Conclusion 

We have argued in this paper, then, that to be able to account for the full
meaning and relevance of utterances in discourse, a theory of discourse structure
must be able to provide an account of both verbally and non-verbally encoded
information. We have shown how a formal treatment of sentential syntactic and
semantic phenomena, when augmented with a discourse model capable of assigning
interpretive contexts to naturally occurring talk, can be used to help us understand
the full force of even a fragmentary utterance in a world of real-world language use.




REFERENCES 


Ades, A. E. and M. J. Steedman (1982). 'On the Order of Words'. Linguistics and
Philosophy Vol. 4.4, pp. 517-558.

Davidson, Donald (1967). 'The Logical Form of Action Sentences'. In: Rescher,
Nicholas ed. The Logic of Decision and Action. Pittsburgh: University of
Pittsburgh Press, pp. 81-95.

Friedman, Joyce, Dawai Dai and Weiguo Wang (1986). 'The Weak Generative Capacity of
Parenthesis-Free Categorial Grammars'. Boston University Computer Science
Tech Report # 86-001.

Hinrichs, E. (1985). A Compositional Semantics of NP Reference and Aktionsarten in
English. Unpublished Ph.D. Dissertation, Ohio State University.

Hinrichs, E. and L. Polanyi (to appear). 'Pointing the Way: Modelling Non-Verbal
Referential Gesture in Natural Language Discourse'. BBN Technical Report.
Bolt Beranek and Newman Labs: Cambridge.

Hymes, D. (1972). 'Models of the Interaction of Language and Social Life'. In:
J. Gumperz and D. Hymes eds. Directions in Sociolinguistics, pp. 35-71. New York:
Holt, Rinehart and Winston.

Kaplan, D. (ms.). 'Demonstratives: An Essay on the Semantics, Logic, Metaphysics, and
Epistemology of Demonstratives and Other Indexicals' (unpublished, University of
California, Los Angeles).

Linde, C. and J. Goguen (1978). 'The Structure of Planning Discourse'. Journal of
Social and Biological Structures Vol. 1, pp. 219-251.

Marslen-Wilson, W., E. Levy, and L. K. Tyler. 'Producing Interpretable Discourse'. In:
Jarvella, R. and W. Klein eds. Speech, Place, and Action. Wiley and Sons: New
York.

Montague, Richard (1970). 'Universal Grammar'. Theoria Vol. 36, pp. 373-398.

Montague, Richard (1973). 'The Proper Treatment of Quantification in Ordinary
English'. In: J. Hintikka, J. Moravcsik, and P. Suppes eds. Approaches to
Natural Language. Reidel: Dordrecht.

Polanyi, L. and R. Scha (1984). 'A Syntactic Approach to Discourse Semantics'.
Proceedings of Coling84. Stanford University: Stanford, pp. 413-419.

Polanyi, L. (1985). 'A Theory of Discourse Structure and Discourse Coherence'.
Proceedings of the 21st Meeting of the Chicago Linguistics Society.
Department of Linguistics, University of Chicago: Chicago, Illinois.

Steedman, M. J. (1985). 'Dependency and Coordination in the Grammar of Dutch and
English'. Language Vol. 61.3, pp. 523-568.

Wittenburg, Kent (1986). Some Properties of Combinatory Categorial Grammars of
Relevance to Parsing. MCC Technical Report Number HI-012-86.




8. PUBLICATIONS

B. A. Goodman, "Reference Identification and Reference Identification Failures,"
Computational Linguistics, Vol. 12, No. 4, 1986 (a revised version also appears as
Rule-Based Relaxation of Reference Identification Failures, Technical Report No. 396,
Center for the Study of Reading, University of Illinois at Urbana-Champaign, 1986).

B. A. Goodman, "Repairing Reference Identification Failures by Relaxation," in
Communication Failure in Dialogue and Discourse, Ronan Reilly (ed.), North-Holland,
1987 (also in Proceedings of the 23rd Annual Meeting of the ACL, July 1985).

B. A. Goodman, "Reference and Reference Failures," in Proceedings of Theoretical
Issues in Natural Language Processing-3, TINLAP-3, New Mexico State University, Las
Cruces, New Mexico, 1987 (also appears as Technical Report No. 398, Center for the
Study of Reading, University of Illinois at Urbana-Champaign, 1986).

Abstract 

The goal of this work is the enrichment of human-machine interactions in a
natural language environment. Because a speaker and listener cannot be assured to
have the same beliefs, contexts, perceptions, backgrounds, or goals at each point in a
conversation, difficulties and mistakes arise when a listener interprets a speaker's
utterance. These mistakes can lead to various kinds of misunderstandings between
speaker and listener, including reference failures or failure to understand the
speaker's intention. We call these misunderstandings miscommunication. Such mistakes
can slow, and possibly break down, communication. Our goal is to recognize and
isolate such miscommunications and circumvent them. These papers highlight a
particular class of miscommunication - reference problems - by describing a case
study and techniques for avoiding failures of reference. We want to illustrate a
framework less restrictive than earlier ones by allowing a speaker leeway in forming
an utterance about a task and in determining the conversational vehicle to deliver it.
These papers also promote a new view of extensional reference.




A. R. Haas, "A Syntactic Theory of Belief and Action," Artificial Intelligence, Vol.
28, No. 3, May 1986.


Abstract 


If we assume that beliefs are sentences of first-order logic stored in an agent's
head, we can build a simple and intuitively clear formalism for reasoning about beliefs.
I apply this formalism to the standard logical problems about belief, and use it to
describe the connection between belief and planning.

E. Hinrichs, "A Compositional Semantics for Directional Modifiers in English -
Locative Case Reopened," Proceedings of the 11th International Conference on
Computational Linguistics, pp. 347-349, August 1986.

Abstract 


This paper presents a model-theoretic semantics for directional modifiers in
English. The semantic theory presupposed for the analysis is that of Montague
Grammar (cf. Montague 1970, 1973), which makes it possible to develop a strongly
compositional treatment of directional modifiers. Such a treatment has significant
computational advantages over case-based treatments of directional modifiers that are
advocated in the AI literature.

E. Hinrichs and L. Polanyi, "Pointing the Way: A Unified Treatment of Referential
Gesture in Interactive Discourse," Proceedings of the 22nd Annual Meeting of the
Chicago Linguistics Society, 1986.

Abstract 


In this paper, we argue that a complete model of interactively constructed
natural discourse must provide a principled account of deictic gesture which
establishes reference to non-linguistic objects, properties and relations. More
specifically, we shall demonstrate that in order to account for the contextual
relevance of linguistic units such as words, phrases and sentences, an adequate
discourse model must include (1) a compositional syntax and semantics at the sentence
level which is capable of dealing with fragmentary linguistic input and (2) a discourse
component which accepts deictic gestures along with traditional linguistic units as
input and assigns the correct context of interpretation to each structure parsed.




L. Polanyi, "The Linguistic Discourse Model: Towards a Formal Theory of
Discourse Structure," BBN Technical Report No. 6409, November 1986.

Abstract 

Despite the apparent disfluency and disorganization of everyday talk, speakers
all but flawlessly recover anaphoric references, interpret temporal and spatial deictic
expressions and use language to shape utterances which demonstrate a clear and
recoverable relationship to "the business at hand" in the talk and the contextualizing
social setting. In the paper, we shall present a comprehensive formal model of
discourse structure, the Linguistic Discourse Model (LDM), which provides a uniform
account of how speakers accomplish these tasks in constructing and understanding
both maximally coherent and highly attenuated discourse.

The LDM is both a competence model of linguistic structure above the sentence
level and a performance model. In the paper, we shall describe the linguistic
discourse  structuring  resources  and  conventions  available  to  speakers  in  carrying  out 
communicative  and  interactional  tasks  and  demonstrate  how  these  resources  are  used 
in  actual  talk  to  create  the  complex  discourses  which  speakers  routinely  produce  and 
interpret.  In  our  view,  providing  an  adequate  account  of  discourse  structural 
relations  is  the  first  step  towards  what  we  believe  to  be  the  eventual  goal  of  formal 
work  in  discourse  understanding  -  the  development  of  a  system  capable  of  assigning  a 
proper  semantic  interpretation  to  every  clause  in  a  discourse. 

L. Polanyi, "A Formal Syntax of Discourse," Technical Report of the Center for
the Study of Reading, University of Illinois, 1986.

R. J. H. Scha, B. C. Bruce and L. Polanyi, "Discourse Understanding," in
Encyclopedia of Artificial Intelligence, S. C. Shapiro (ed.), John Wiley and Sons, New
York, 1986 (also appears as Technical Report No. 391, Center for the Study of Reading,
University of Illinois, 1986).

Abstract 

Research on natural language understanding has often focused on the problem of
analyzing the structure and meaning of isolated sentences. To understand whole texts
or dialogues, these sentences must be seen as elements whose significance resides in
the contribution they make to the larger whole. A computer natural language
understanding system must interpret each sentence with respect to both the linguistic
context established by preceding sentences and the real-world setting. This paper
reviews work on these issues, examining theories of the structure of discourse, the
semantics of discourse, speech acts and pragmatics, and different communication
modalities.


J. G. Schmolze, "Physics for Robots," Ph.D. Dissertation, University of
Massachusetts, February 1986 (also a revised version to appear as BBN Technical
Report No. 6222, Fall 1987).


C.  L.  Sidner,  "Intentions,  Attention  and  the  Structure  of  Discourse," 
Computational  Linguistics,  Vol.  12,  No.  3,  1986. 

Abstract 


In  this  paper  we  explore  a  new  theory  of  discourse  structure  that  stresses  the 
role  of  purpose  and  processing  in  discourse.  In  this  theory,  discourse  structure  is 
composed of three separate but interrelated components: the structure of the
sequence  of  utterances  (called  the  linguistic  structure),  a  structure  of  purposes 
(called  the  intentional  structure),  and  the  state  of  focus  of  attention  (called  the 
attentional  state).  The  linguistic  structure  consists  of  segments  of  the  discourse  into 
which  the  utterances  naturally  aggregate.  The  intentional  structure  captures  the 
discourse-relevant  purposes,  expressed  in  each  of  the  linguistic  segments  as  well  as 
relationships  among  them.  The  attentional  state  is  an  abstraction  of  the  focus  of 
attention  of  the  participants  as  the  discourse  unfolds.  The  attentional  state,  being 
dynamic,  records  the  objects,  properties,  and  relations  that  are  salient  at  each  point 
of  the  discourse.  The  distinction  among  these  components  is  essential  to  provide  an 
adequate  explanation  of  such  discourse  phenomena  as  cue  phrases,  referring 
expressions,  and  interruptions. 



The theory of attention, intention, and aggregation of utterances is illustrated in
the paper with a number of example discourses. Various properties of discourse are
described, and explanations for the behavior of cue phrases, referring expressions,
and interruptions are explored.




This  theory  provides  a  framework  for  describing  the  processing  of  utterances  in 
a  discourse.  Discourse  processing  requires  recognizing  how  the  utterances  of  the 
discourse  aggregate  into  segments,  recognizing  the  intentions  expressed  in  the 
discourse  and  the  relationships  among  intentions,  and  tracking  the  discourse  through 
the  operation  of  the  mechanisms  associated  with  attentional  state.  This  processing 
description  specifies  in  these  recognition  tasks  the  role  of  information  from  the 
discourse  and  from  the  participants'  knowledge  of  the  domain. 

N. S. Sridharan, "Semi-applicative Programming: Examples of Context-free
Recognizers," BBN Report No. 6135, January 1986.

Abstract 

Most current parallel programming languages are designed with a sequential
programming language as the base language and have added constructs that allow
parallel execution. We are experimenting with an applicative base language that has
implicit parallelism everywhere, and then we introduce constructs that inhibit
parallelism. The base language uses pure LISP as a foundation and blends in
interesting features of Prolog and FP. Proper utilization of available machine
resources is a crucial concern of programmers. We advocate several techniques of
controlling the behavior of functional programs without changing their meaning or
functionality: program annotation with constructs that have benign side-effects,
program transformation and adaptive scheduling. This combination yields us a
semi-applicative programming language and an interesting program methodology.

In this paper we deal with context-free parsing as an illustration of
semi-applicative programming. Starting with the specification of a context-free recognizer,
we have been successful in deriving variants of the recognition algorithm of
Cocke-Kasami-Younger. One version is the CKY algorithm in parallel. The second version
includes a top-down predictor to limit the work done by the bottom-up recognizer.
The third version uses a cost measure over derivations and produces minimal cost
parses using a dynamic programming technique. In another line of development, we
arrive at a parallel version of the Earley algorithm. All of these algorithms reveal
more concurrency than was apparent at first glance.
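
The bottom-up recognition scheme named in this abstract can be sketched sequentially as follows. This is a minimal illustration in Python, not the semi-applicative program of BBN Report No. 6135: the function name cky_recognize, the dictionary encoding of the grammar, and the toy grammar itself are assumptions made for the example, and the grammar is assumed to be in Chomsky normal form.

```python
from itertools import product


def cky_recognize(words, lexical, binary, start="S"):
    """Return True if `words` can be derived from `start`.

    lexical: dict mapping a terminal to the set of nonterminals A with A -> terminal
    binary:  dict mapping a pair (B, C) to the set of nonterminals A with A -> B C
    (Both encodings are assumptions of this sketch.)
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(lexical.get(word, ()))
    # Fill longer spans from shorter ones.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for b, c in product(table[i][k], table[k][j]):
                    table[i][j] |= binary.get((b, c), set())
    return start in table[0][n]


# Hypothetical toy grammar, used only to exercise the recognizer.
LEXICAL = {"she": {"NP"}, "fish": {"NP", "V"}, "eats": {"V"}}
BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

print(cky_recognize("she eats fish".split(), LEXICAL, BINARY))
print(cky_recognize("eats she fish".split(), LEXICAL, BINARY))
```

In this table layout every cell of a given span length depends only on cells of shorter spans, so the cells within one pass of the outer loop could in principle be filled independently of one another.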

M. Vilain (with H. Kautz), "Constraint Propagation Algorithms for Temporal
Reasoning," Proceedings of the Fifth Annual Conference on Artificial Intelligence (AAAI),
Philadelphia, PA, pp. 377-382, August 1986.

Abstract 

This  paper  considers  computational  aspects  of  several  temporal  representation 
languages.  It  investigates  an  interval-based  representation,  and  a  point-based  one. 
Computing  the  consequences  of  temporal  assertions  is  shown  to  be  computationally 
intractable  in  the  interval-based  representation,  but  not  in  the  point-based  one. 
However,  a  fragment  of  the  interval  language  can  be  expressed  using  the  point 
language and benefits from the tractability of the latter.
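
As a rough illustration of constraint propagation over a point-based representation, the sketch below tightens relations between time points by repeatedly composing them through intermediate points. It is a generic propagation loop written for this example, not the algorithm as published in the paper: the encoding of a relation as a subset of the three basic relations "<", "=", ">", the function names, and the iteration order are all assumptions.

```python
from itertools import product

ALL = frozenset("<=>")                      # no information about two points
INVERSE = {"<": ">", "=": "=", ">": "<"}


def compose_basic(r, s):
    """Relations possible between x and z, given x r y and y s z."""
    if r == "=":
        return {s}
    if s == "=":
        return {r}
    if r == s:
        return {r}          # "<" then "<" gives "<"; likewise for ">"
    return set(ALL)         # "<" then ">" (or the reverse) tells us nothing


def compose(rs, ss):
    out = set()
    for r, s in product(rs, ss):
        out |= compose_basic(r, s)
    return out


def propagate(n, constraints):
    """Tighten relations among n points.

    constraints: dict mapping (i, j) to the set of basic relations allowed
    between point i and point j (hypothetical input format).
    Returns the tightened n x n table, or None if an entry becomes empty.
    """
    rel = [[set(ALL) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        rel[i][i] = {"="}
    for (i, j), allowed in constraints.items():
        rel[i][j] &= set(allowed)
        rel[j][i] &= {INVERSE[r] for r in allowed}
    changed = True
    while changed:
        changed = False
        for i, k, j in product(range(n), repeat=3):
            tightened = rel[i][j] & compose(rel[i][k], rel[k][j])
            if tightened != rel[i][j]:
                rel[i][j] = tightened
                changed = True
                if not tightened:
                    return None     # the assertions cannot all hold
    return rel


# Point 0 is before point 1, and point 1 is before or equal to point 2.
table = propagate(3, {(0, 1): {"<"}, (1, 2): {"<", "="}})
print(table[0][2])
```

Each pass only ever removes basic relations from an entry, so the loop terminates after a bounded number of changes; the sketch makes no claim about completeness of the resulting table.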




9.  PRESENTATIONS 

B. Goodman, "Miscommunication and Plan Recognition," at the User Modelling
Workshop, Maria Laach, West Germany, August 1986.

B. Goodman, "Reference and Reference Failures," Theoretical Issues in Natural
Language Processing III (TINLAP-3), New Mexico State University, Las Cruces, New Mexico,
January 1987.

E. Hinrichs, "A Compositional Semantics for NP Reference and Aktionsarten," West
Coast Conference on Formal Linguistics, March 1986.

E. Hinrichs and L. Polanyi, "Pointing the Way: A Unified Treatment of Referential
Gesture in Interactive Discourse," at the 22nd Annual Meeting of the Chicago Linguistic
Society, April 17-19, 1986.

E. Hinrichs, "A Compositional Semantics for Directional Modifiers in English -
Locative Case Reopened," at the 11th International Conference on Computational
Linguistics, University of Bonn, August 25-29, 1986.

L. Polanyi, "Discourse Analysis from a Linguistic Point of View," Boston
Interaction Research Group, January 29, 1986.

L. Polanyi, "A Linguistic Approach to Discourse Analysis," Massachusetts
Interdisciplinary Discourse Analysis Seminar, February 5, 1986.

L. Polanyi, "A Formal Model of Discourse Structure," invited paper at 2nd
Cognitive Science Seminar, Tel Aviv University, April 1-6, 1986.

L. Polanyi, "Discourse Syntax, Discourse Semantics, Discourse Semiotics: The Case
of the Discourse Pivot," invited lecture, Cognitive Science Series, University of Buffalo,
April 23, 1986.

L. Polanyi, "Narrative Organization and Disorganization," invited paper, Workshop
Symposium on the Acquisition of Temporal Structures in Discourse, University of
Chicago, April 16, 1986.




J. Schmolze, "Physics for Robots," at Brandeis University, March 1986.

J. Schmolze, "Physics for Robots," at AAAI-86, Philadelphia, August 11-15, 1986.

J. Schmolze, "Semantics for NIKL," MIT Workshop on Terminological Languages,
Cambridge, MA, July 1986.

C. Sidner, "AI, Computational Linguistics and Discourse Theory," Massachusetts
Interdisciplinary Discourse Analysis Seminar, March 1986.

C. Sidner, "Modelling Discourse Structure: The Role of Purpose in Discourse,"
invited talk, Sixth Annual Canadian AI Conference, Montreal, Canada, May 1986.

C. Sidner (with B. Grosz), "Plans in Discourse," SDF Benchmark Series in
Computational Linguistics III, Plans and Intentions in Communication and Discourse,
Monterey, California, March 1987.

N. S. Sridharan, "Semi-Applicative Programming: Examples of Context Free
Recognizers," Technical Report No. 6135, BBN Laboratories Inc., January 1986.

N. S. Sridharan, Workshop on Future Directions in Computing, sponsored by the
Army Research Office, Seabrook Island, SC, May 1986.

N. S. Sridharan, Workshop on Artificial Intelligence, sponsored by Ministry of
Defense, India, in Bangalore, June 1986.

M. B. Vilain (with H. Kautz), "Constraint Propagation Algorithms for Temporal
Reasoning," at AAAI-86, Philadelphia, August 11-15, 1986.

M. B. Vilain, "Recent and Forthcoming Developments in KL-TWO," MIT Workshop on
Terminological Languages, Cambridge, MA, July 1986.


Official  Distribution  List 


Contract  N00014-85-C-0079 


                                                  Copies

Scientific Officer                                     1
Head, Information Sciences Division
Office of Naval Research
800 North Quincy Street
Arlington, VA 22217-5000
Attn: Dr. Alan L. Meyrowitz

Mr. Frank Skieber                                      1
Defense Contract Administration
Services Region - Boston
495 Summer Street
Boston, MA 02210-2184

Director, Naval Research Laboratory                    1
Attn: Code 2627
Washington, DC 20375

Defense Technical Information Center                  12
Bldg. 5
Cameron Station
Alexandria, VA 22314


